Abstract

Logical formalisms such as first-order logic (FO) and fixpoint logic (FP) are well suited
to express in a declarative manner fundamental graph functionalities required in distributed
systems. We show that these logics constitute good abstractions for programming distributed
systems as a whole, since they can be evaluated in a fully distributed manner with reasonable
complexity upper-bounds. We first prove that FO and FP can be evaluated with a polynomial
number of messages of logarithmic size. We then show that the (global) logical formulas can
be translated into rule programs describing the local behavior of the nodes of the distributed
system, which compute equivalent results. Finally, we introduce local fragments of these logics,
which preserve as much as possible the locality of their distributed computation, while offering
a rich expressive power for networking functionalities. We prove that they admit tighter
upper-bounds with a bounded number of messages of bounded size. We also show that the semantics
and the complexity of the local fragments are preserved over locally consistent networks as well
as anonymous networks, thus showing the robustness of the proposed local logical formalisms.

1 Introduction 

Logical formalisms have been widely used in different fields of computer science to provide
high-level programming abstractions. The relational calculus, used by Codd to describe data-centric
applications in an abstract way, is at the origin of the technological and commercial success of
relational database management systems [16]. Datalog, an extension of Horn clause logic with
fixpoints, has been widely used to specify functionalities involving recursion [17].

The development of distributed applications over networks of devices is generally a very tedious
task, involving the handling of low-level system details. The lack of high-level programming
abstractions has been identified as one of the roadblocks for the deployment of networks of
cooperating objects.

Recently, the use of queries to define network applications has been considered. Initially, the 
idea emerged in the field of sensor networks. It was suggested to see the network as a database, 
and interact with it through declarative queries. Several systems have been developed, among 
which Cougar [9] and TinyDB [14], supporting SQL dialects. Queries are processed in a centralized
manner, leading to distributed execution plans.

More recently, query languages were proposed as a means to express communication network
problems such as routing protocols [13] and declarative overlays [12]. This approach, known as
declarative networking, is extremely promising, for it offers a high-level abstraction to program
networks. It was also shown how to use recursive queries to perform diagnosis of asynchronous systems
[1], network monitoring [18], as well as self-organization protocols [10]. Distributed query languages
provide new means to express complex network problems such as node discovery [3], route finding,
path maintenance with quality of service [6], and topology discovery, including physical topology.

However, there is a lack of systematic theoretical investigations of query languages in the dis- 
tributed setting, in particular on their semantics, as well as the complexity of their distributed 
computation. In the present paper, we consider a distributed evaluation of classical query lan- 
guages, namely, first-order logic and fixpoint logic, which preserves their classical semantics. 

First-order logic and fixpoint logic have been extensively investigated in the context of database 
theory [2] as well as finite model theory [7]. Since the seminal paper of Fagin [8], showing that the
class NP corresponds exactly to problems which can be expressed in existential second-order logic, 
many results have linked Turing complexity classes with logical formalisms. Parallel complexity has 
also been considered for first-order queries which can be evaluated in constant time over circuits 
with arbitrary fan-in gates [11].

This raised our curiosity on the distributed potential of these classical query languages to express 
the functionalities of communication networks, which have to be computed in a distributed manner 
over the network itself. If their computation can be distributed efficiently, they can form the basis 
of a high level abstraction for programming distributed systems as a whole. 

We rely on the classical message passing model [4]. Nodes exchange messages with their
neighbors in the network. We consider four measures of complexity: (i) the in-node computational
complexity, rarely addressed in distributed computing; (ii) the distributed time complexity; (iii) 
the message size; and (iv) the per-node message complexity. The behavior of the nodes is governed 
by an algorithm, the distributed query engine, which is installed on each node, and evaluates the 
queries by alternating local computation and exchange of queries and results with the other nodes. 

We first consider the distributed complexity of first-order logic and fixpoint logic with
inflationary semantics, which accumulates all the results of the different stages of the computation.
Note that our results carry over to other formalisms such as least fixpoint. We prove that the
distributed complexity of first-order queries is in $O(\log n)$ in-node time, $O(\Delta)$ distributed time
($\Delta$ is the diameter of the network), messages of size $O(\log n)$, and a polynomial number of messages
per node. For fixpoint, a similar bound can be shown but with a polynomial distributed time.

We then consider the translation of logical formulae that express properties of graphs at a 
global level, into rule programs that express the behavior of nodes at a local level, and compute the 
same result. We introduce a rule language, Netlog, which extends Datalog, with communication 
primitives, and is well suited to express distributed applications, ranging from networking protocols 
to distributed data management. Netlog is supported by the Netquest system, on which the 
examples of this paper have been implemented. We prove that graph programs in Datalog¬ [2] can
be translated to Netlog programs. Since it is well known that first-order and fixpoint logics can be
translated into Datalog¬ [7], it follows that global logical formulae can be translated into behavioral
programs in Netlog producing the same result.

Finally, we define local fragments of first-order and fixpoint logic, respectively $FO_{loc}$ and $FP_{loc}$.
These fragments provide a good compromise in the trade-off between expressive power and efficiency
of the distributed evaluation. Important network functionalities (e.g. spanning trees, on-demand
routes, etc.) can be defined easily in $FP_{loc}$. Meanwhile, their complexity is constant for all our
measures except the distributed time, which is linear in the diameter for $FO_{loc}$ and in the size of the
network for $FP_{loc}$.

Our results shed light on the complexity of the distributed evaluation of queries. Note that
if the communication network is a clique (unbounded degree), our machinery resembles Boolean
circuits, and we get constant distributed time, a result which echoes the classical $AC^0$ bound.

We have restricted our attention to bounded degree graphs and synchronous systems. Most of
our algorithms carry over, or can be extended, to unrestricted graphs and asynchronous computation,
but the complexity bounds do not necessarily do so. Interestingly, the results for the local fragments
carry over to other classes of networks, such as locally consistent networks or anonymous networks,
thus showing the robustness of the languages $FO_{loc}$ and $FP_{loc}$.

The paper is organized as follows. In the next section, we recall the basics of first-order and
fixpoint logics. In Section 3, the computation model is presented. Section 4 is devoted to distributed
first-order query execution, and Section 5 to fixpoint query execution. In Section 6, we introduce a
behavioral language, Netlog, and show that FP formulae can be translated into equivalent Netlog
programs. In Section 7, we consider the restriction to the local fragments, and show that they can
be evaluated over different types of networks.

2 Graph logics 

We are interested in functions on graphs that represent the topology of communication networks.
We thus restrict our attention to finite connected bounded-degree undirected graphs. Let D be the
bound on the degree.

We assume the existence of an infinite ordered set of constants, U, the universe of node Id's. A
graph, $\mathbf{G} = (V, G)$, is defined by a finite set of nodes $V \subset U$, and a set of edges $G \subseteq V \times V$.

We express the functions on graphs as queries. A query of arity $\ell$ is a computable mapping from
finite graphs to finite relations of arity $\ell$ over the domain of the input graph, closed under graph
isomorphisms. A Boolean query is a query with Boolean output.

Logical languages have been widely used to define queries. A formula $\varphi$ over signature $G$ with
$\ell$ free variables defines a query mapping instances of finite graphs $\mathbf{G}$ to relations of arity $\ell$ defined
by: $A = \{(x_1, \dots, x_\ell) \mid \mathbf{G} \models \varphi(x_1, \dots, x_\ell)\}$. We equivalently write $\mathbf{G}, A \models \varphi$.

We denote by FO the set of queries definable using first-order formulae. First-order queries can
be used, for instance, to check locally forbidden configurations. Their expressive power
is rather limited though.

Fixpoint logics, on the other hand, allow one to express fundamental network functionalities, such as
those involving paths. If $\varphi(T; x_1, \dots, x_\ell)$ is a first-order formula with $\ell$ free variables over signature
$\{G, T\}$, where $T$ is a new relation symbol of arity $\ell$, called the fixpoint relation, then $\mu(\varphi(T))$
denotes a fixpoint formula whose semantics is defined inductively as the inflationary fixpoint $I$ of
the sequence:

$I_0 = \emptyset;$

$I_{i+1} = \varphi(I_i) \cup I_i, \quad i \geq 0$

where $\varphi(I_i)$ denotes the result of the evaluation of $\varphi(T)$ with $T$ interpreted by $I_i$. The $I_i$'s constitute
the stages of the computation of the fixpoint. We write $\mathbf{G}, I \models \mu(\varphi(T))$ whenever $I$ is the fixpoint
of the formula $\varphi(T)$ as defined by the above induction.
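
The inflationary iteration can be illustrated by a small centralized sketch. It is only an illustration of the semantics, not the distributed engine of the paper; the function name `inflationary_fixpoint`, the example formula `phi_reach` (reachability) and the four-node path graph are assumptions introduced here.

```python
def inflationary_fixpoint(phi):
    """Iterate I_{i+1} = phi(I_i) | I_i from the empty relation until nothing changes.

    Returns the fixpoint together with the list of stages I_0, I_1, ...
    """
    current = frozenset()
    stages = [current]
    while True:
        nxt = frozenset(phi(current)) | current
        if nxt == current:
            return current, stages
        current = nxt
        stages.append(current)


# Hypothetical example graph: the undirected path 1 - 2 - 3 - 4.
edges = {(1, 2), (2, 3), (3, 4)}
G = edges | {(b, a) for (a, b) in edges}


def phi_reach(T):
    """phi(T)(x, y) = G(x, y) or exists z (T(x, z) and G(z, y))."""
    step = {(x, y) for (x, z) in T for (z2, y) in G if z == z2}
    return G | step


reach, stages = inflationary_fixpoint(phi_reach)
```

Each stage only adds tuples, so on a finite graph the loop stops after polynomially many iterations in the number of nodes.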

It is well known [2] that on ordered domains, the class of graph queries defined by inflationary
fixpoint, denoted FP, captures exactly all PTIME mappings, that is mappings that can be computed
on a Turing machine in time polynomial in the size of the graph.

The following examples illustrate the expressive power of FP for distributed applications. 

For instance, the formula $\mu(\varphi(T)(x, h, d))$, where the formula $\varphi(T)(x, h, d)$ is defined by:

$(G(x, h) \wedge h = d) \vee (G(x, h) \wedge \exists z\,(T(h, z, d) \wedge x \neq z) \wedge \neg\exists u\, T(x, u, d))$

defines a table-based routing protocol (OLSR-like) on the graph $\mathbf{G}$, where $h$ is the next hop from
$x$ to destination $d$.
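
To see how the stages of this routing formula build the tables, the following sketch evaluates it centrally on a small graph. It assumes the reading of the formula in which the second disjunct also requires G(x, h); the names `phi_route` and `routing_tables` and the path graph are hypothetical and chosen for illustration only.

```python
def phi_route(G, nodes, T):
    """One application of phi(T)(x, h, d): h is a next hop from x towards d."""
    result = set()
    for (x, h) in G:
        for d in nodes:
            direct = (h == d)
            relayed = (
                # h already knows a route to d that does not go back through x ...
                any((h, z, d) in T and z != x for z in nodes)
                # ... and x has no next hop for d yet.
                and not any((x, u, d) in T for u in nodes)
            )
            if direct or relayed:
                result.add((x, h, d))
    return result


def routing_tables(G, nodes):
    """Inflationary fixpoint of phi_route, starting from the empty relation."""
    T = set()
    while True:
        nxt = phi_route(G, nodes, T) | T
        if nxt == T:
            return T
        T = nxt


# Hypothetical example graph: the undirected path 1 - 2 - 3 - 4.
edges = {(1, 2), (2, 3), (3, 4)}
G = edges | {(b, a) for (a, b) in edges}
nodes = {1, 2, 3, 4}
T = routing_tables(G, nodes)
```

On this path graph every node ends up with exactly one next hop per destination, for example `(1, 2, 4)` for node 1 routing to node 4.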

A spanning tree from a node $x$ satisfying $ReqNode(x)$ can be defined by a fixpoint formula
$\mu(\varphi(ST)(x, y))$, where the formula $\varphi(ST)(x, y)$ is defined by:

$(G(x, y) \wedge ReqNode(x)) \vee$

$(\neg\exists x'\, ST(x', y) \wedge \exists w\,(ST(w, x) \wedge w \neq y) \wedge G(x, y) \wedge \forall w' \forall x'\,(ST(w', x') \wedge G(x', y) \rightarrow x' \geq x))$

Similarly, an On-Demand Routing protocol (AODV-like) can be defined by the fixpoint queries
$\mu(\varphi(RouteReq)(x, y, d))$ and $\mu(\psi(NextHop)(x, y, d))$, where $d$ is a constant and $\varphi(RouteReq)(x, y, d)$
is defined by:

$(G(x, y) \wedge ReqNode(x) \wedge dest(d)) \vee$

$(\exists w\,(RouteReq(w, x, d) \wedge w \neq y) \wedge G(x, y) \wedge x \neq d \wedge \neg\exists w'\, RouteReq(w', y, d))$

and $\psi(NextHop)(x, y, d)$ is defined by:

$(RouteReq(x, d, d) \wedge y = d) \vee (\exists z\, NextHop(y, z, d) \wedge RouteReq(x, y, d))$

where a route request is first emitted by a node $x$ satisfying $ReqNode(x)$, then a path defined by
next hops from that node to destination $d$ is established by backward computation on the route
request.

3 Distributed evaluation 

We are interested in this paper in the distributed evaluation of queries. We assume that each query 
to the network is posed by a requesting node (the node satisfying the predicate ReqNode{x) in the 
examples of the previous section). 

The result of a query shall be distributed over all the nodes of the network. In a query
$Q(x_1, x_2, \dots, x_\ell)$, one of the attributes $x_i$ denotes the holding node, written explicitly as $@x_i$, that is
the node which holds the results relative to $x_i$. More precisely, the tuple $(a_1, \dots, a_{i-1}, a, a_{i+1}, \dots, a_\ell)$
is held by node $a$, such that $Q(a_1, \dots, a_{i-1}, a, a_{i+1}, \dots, a_\ell)$ holds. For simplicity, we will choose
the first variable as holding attribute.
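
The holding-attribute convention amounts to partitioning the answer relation by one of its columns. A minimal sketch, assuming the first attribute is the holding one; the function name `distribute_by_holding_node` and the sample tuples are hypothetical.

```python
from collections import defaultdict


def distribute_by_holding_node(tuples, holding_index=0):
    """Group answer tuples by the node named in the holding attribute."""
    tables = defaultdict(set)
    for answer in tuples:
        tables[answer[holding_index]].add(answer)
    return dict(tables)


# Hypothetical answers (x, next_hop, destination) of a routing query.
answers = {(1, 2, 2), (1, 2, 3), (2, 1, 1), (2, 3, 3), (3, 2, 1)}
local_tables = distribute_by_holding_node(answers)
```

With this layout, `local_tables[1]` is the part of the result stored on node 1, that is its own routing entries.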

The results of fixpoint queries are thus distributed on holding nodes. In the OLSR-like example
of the previous section, each node shall hold its routing table as a result of the evaluation of the
query.

The nodes of the network are equipped with a distributed query engine to evaluate queries.
It is a universal algorithm that performs the distributed evaluation of any network functionality
expressed using queries. The computation relies on the message passing model for distributed
computing [4].

The configuration of a node is given by a state, an in-buffer for incoming messages, an out-buffer 
for outgoing messages, and some local data and metadata used for the computation. We assume 
that the metadata on each node contain a unique identifier, the upper bound on the size of the 
network, $n$, and the diameter of the network, $\Delta$. We also assume that the local data of each node
includes all its neighbors with their identifiers. 

We distinguish between computation events, performed in a node, and delivery events, performed 
between nodes which broadcast their messages to their neighbors. A sequence of computation events 
followed by delivery events is called a round of the distributed computation. 

A local execution is a sequence of alternating configurations and events occurring on one node. 
We assume that the network is static (nodes do not move) and that communication is
failure-free.

We assume that at the beginning of the computation of a query, all the nodes are idle, in their initial
state, with their in-buffers and out-buffers empty. Note that it is easy to extend the present
computational framework to a multithreaded computation with several concurrent queries running
in the network. The requesting node broadcasts its query to its neighbors. The incoming messages
in the subsequent nodes trigger the start of their query engine computation.

The evaluation of a query terminates when the out-buffers of all nodes are empty. The result 
is distributed over the network in the memories of all nodes. Note that alternative termination 
modes are also possible. 

We consider four measures of the complexity of the distributed computation: 

• The per-round in-node computational complexity, IN-TIME/ROUND, is the maximal compu- 
tational time of the in-node computation in one round; 

• The distributed time complexity, DIST-TIME, is the maximum number of rounds of any local 
execution of any node till the termination; 

• The message size, MSG-SIZE, is the maximum number of bits in messages; 

• The per-node message complexity, #MSG/NODE, is the maximum number of messages sent
by any node till the termination of the evaluation.
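
Two of these measures can be observed on a toy simulation of synchronous rounds. The sketch below floods a single query through the network and stops when all out-buffers are empty; it assumes that one broadcast to all neighbors counts as one sent message. The function name `flood_query` and the example graph are hypothetical.

```python
def flood_query(adjacency, requester):
    """Simulate synchronous flooding of one query.

    Returns (number of rounds until all out-buffers are empty,
             number of broadcasts sent by each node).
    """
    reached = {requester}
    out_buffer = {requester: ["query"]}      # only non-empty out-buffers are kept
    sent = {node: 0 for node in adjacency}
    rounds = 0
    while out_buffer:                        # termination: every out-buffer is empty
        rounds += 1
        in_buffer = {}
        # Delivery events: each node broadcasts its out-buffer to its neighbors.
        for node, messages in out_buffer.items():
            sent[node] += len(messages)
            for neighbor in adjacency[node]:
                in_buffer.setdefault(neighbor, []).extend(messages)
        # Computation events: nodes seeing the query for the first time forward it.
        out_buffer = {}
        for node in in_buffer:
            if node not in reached:
                reached.add(node)
                out_buffer[node] = ["query"]
    return rounds, sent


# Hypothetical example graph: the undirected path 1 - 2 - 3 - 4.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
dist_time, msgs_per_node = flood_query(adjacency, requester=1)
```

Here `dist_time` plays the role of DIST-TIME and `max(msgs_per_node.values())` that of #MSG/NODE for this single broadcast.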

There is a trade-off between the in-node computation and the communication. Our objective is to 
distribute the workload in the network as evenly as possible, with a balanced amount of computation 
and communication on each node. Clearly, centralized computation can be carried out by loading
the topology of the network on the requesting node, and performing the evaluation by in-node 
computation. The centralized evaluation of FO and FP admits the following complexity bounds. 

Proposition 1. Let $\mathbf{G}$ be a network of diameter $\Delta$, with $n$ nodes. Let $\varphi$ be a FO formula with $v$
variables. The complexity of the centralized evaluation of the query $\varphi$ on $\mathbf{G}$ is given by:

IN-TIME/ROUND      DIST-TIME      MSG-SIZE       #MSG/NODE
$O(n^v \log n)$    $O(\Delta)$    $O(\log n)$    $O(n)$

Suppose $\mu(\varphi(T)(x_1, \dots, x_\ell))$ is a FP formula such that $T$ is a relational symbol of arity $\ell$, and
it contains $v = \ell + k$ variables ($\ell$ free and $k$ bounded). Then the complexity upper-bound of the
centralized evaluation of the query $\mu(\varphi(T)(x_1, \dots, x_\ell))$ on $\mathbf{G}$ is the same as the above complexity
for FO formulae except for the IN-TIME/ROUND which is in $O(n^{v+\ell} \log n)$.

Note that all nodes, but the requesting node, have $O(\log n)$ per-round in-node complexity. The
proof of this result follows from classical results on data complexity of query languages [2]. In the
sequel, we focus exclusively on distributed query evaluation.

4 Distributed complexity of FO 

In this section we show that the distributed evaluation of FO can be done with a polynomial
number of messages but logarithmic in-node computation per round. The result relies on a naive
distributed query engine for FO, $QE_{FO}$, which works as follows.

The requesting node starts the computation by submitting a query. The nodes broadcast to their
neighbors Boolean answers to queries when they have them, and otherwise the queries they cannot
answer. Each node reduces queries by instantiating variables. In $QE_{FO}$, nodes start instantiating
from the leftmost quantified variable, and from the rightmost free variable. The last instantiated
free variable therefore denotes the holding node of the query, on which the corresponding tuples
will be stored. The nodes simplify the queries by removing all facts, or subformulae they can fully
evaluate.

Let $\varphi$ be a first-order formula with $\ell$ free variables. The query engine handles the following
message types: message $?_B\varphi$ for Boolean queries, message $?x_1 \dots ?x_i\, !a_{i+1} \dots !a_\ell\, \varphi$ for
non-Boolean queries, and message $!_B\varphi$ for answers of Boolean queries.

Each node stores pairs (query, parent query) in a query table, associating the query being
evaluated to the query from which it derives. Nodes also store the Boolean answers $!_B\varphi$ and
non-Boolean answers $(a_1 \dots a_\ell)$ to queries in an answer table.

We will see that the diameter $\Delta$ of the graph induces an upper-bound on the response time of
queries. The algorithm uses clocks that are defined according to this upper-bound. Clocks are
associated with the evaluation of queries as well as subqueries. After the time of a clock associated with
a query on a node has elapsed, the value of the query can be determined by the node. From now
on, we assume that we are given a clock compliant with the communication graph. The value of
the clocks will be defined in Definition 1 below.

The main steps of the query engine work as follows. Note that we assume for simplicity in 
the sequel that the system is synchronous. This assumption can be relaxed easily in asynchronous 
systems without impact on the complexity by using spanning trees rather than the clocks. 

Initial Boolean query emission. For a Boolean query, the requesting node, say $a$, broadcasts
the query $?_B\psi$, adds $(?_B\psi, nil)$ into the query table, and sets a clock for the answer. Meanwhile it
instantiates the leftmost bounded variable and produces a subquery. For an existentially quantified
formula $\exists x\,\psi$, if $\psi(a)$ is true then it is a witness that $\exists x\,\psi$ is true. For a universally quantified formula
$\forall x\,\psi$, if $\psi(a)$ is false then it is a counterevidence and $\forall x\,\psi$ is false. If the node does not have the answer
to $\psi(a)$, it inserts $\psi(a)$ along with its parent query into the query table, e.g. $(?_B\psi(a), ?_B\exists x\,\psi)$,
broadcasts $\psi(a)$ and also sets a clock for $\psi(a)$. If no witness / counterevidence is received before
the clock elapses, then $\exists x\,\psi$ is false / $\forall x\,\psi$ is true. It then recursively handles $\psi(a)$ in the same way.
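
The witness / counterevidence rule can be mimicked centrally by instantiating the leftmost quantified variable with every node in turn. The sketch below only illustrates this decision rule; it does not model messages, tables or clocks. The function names `evaluate` and `adjacent_or_equal` and the two example graphs are hypothetical.

```python
def evaluate(prefix, matrix, nodes, assignment=()):
    """Evaluate a prenex Boolean query by instantiating the leftmost variable first.

    prefix: list of "exists" / "forall", one entry per quantified variable.
    matrix: quantifier-free part, a function of the instantiated node ids.
    """
    if not prefix:
        return matrix(*assignment)
    quantifier, rest = prefix[0], prefix[1:]
    sub_answers = (evaluate(rest, matrix, nodes, assignment + (a,)) for a in nodes)
    if quantifier == "exists":
        # One witness makes the query true; no witness at all means false.
        return any(sub_answers)
    # One counterevidence makes the query false; none at all means true.
    return all(sub_answers)


def adjacent_or_equal(edges):
    """Matrix x = y or G(x, y) over an undirected edge set."""
    return lambda x, y: x == y or (x, y) in edges or (y, x) in edges


# Query: exists x forall y (x = y or G(x, y)), i.e. some node is adjacent to all others.
star = {(1, 2), (1, 3), (1, 4)}
path = {(1, 2), (2, 3), (3, 4)}
nodes = [1, 2, 3, 4]
hub_in_star = evaluate(["exists", "forall"], adjacent_or_equal(star), nodes)
hub_in_path = evaluate(["exists", "forall"], adjacent_or_equal(path), nodes)
```

In the distributed engine the absence of a witness or counterevidence is detected by the expiry of the clock rather than by exhausting a loop.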

Boolean query reception. Every node, upon reception of a Boolean query $?_B\psi$, first checks
its query table. If there is a record for this query, it does nothing. Otherwise its behavior is similar
to the Boolean query emission of the requesting node, with the difference that it also broadcasts
the answer.

Boolean answer reception. Every node receiving an answer $!_B\psi$ to a Boolean query checks
its answer table. If there is a record, it does nothing. Otherwise, it stores the answer and checks the
query table. If it is waiting for the answer, it then tries to evaluate the parent query (if there is one),
and stores and broadcasts the resulting answer if it obtains one; if it is not waiting for the answer,
it broadcasts $!_B\psi$.

Initial non-Boolean query emission. The requesting node submits and broadcasts the query
$?x_1 \dots ?x_\ell\, \varphi(x_1, \dots, x_\ell)$. It sets the clock, inserts $(?x_1 \dots ?x_\ell\, \varphi(x_1, \dots, x_\ell), nil)$ into the query table,
instantiates the rightmost free variable to get the subquery, which is $?x_1 \dots ?x_{\ell-1}\, !a\, \varphi(x_1, \dots, x_{\ell-1}, a)$,
and broadcasts it. Meanwhile the subquery is inserted into the query table and handled further by
the requesting node. When all the free variables are instantiated, the Boolean query $?_B\varphi(a_1 \dots a_\ell)$
is emitted and a record $(?_B\varphi(a_1 \dots a_\ell), !a_1 \dots !a_\ell\, \varphi(a_1 \dots a_\ell))$ is inserted in the query table of node
$a_1$.

Non-Boolean query reception. Every node checks its query table when it receives a query
$?x_1 \dots ?x_{i-1}\, !a_i \dots !a_\ell\, \varphi(x_1, \dots, x_{i-1}, a_i, \dots, a_\ell)$. If there is a record in the table, it does nothing.
Otherwise, it stores $(?x_1 \dots ?x_{i-1}\, !a_i \dots !a_\ell\, \varphi(x_1, \dots, x_{i-1}, a_i, \dots, a_\ell), nil)$ in the query table; its behavior
is then similar to the initial non-Boolean query emission with $i - 1$ free variables.

Distributed tuple answer collection. If the Boolean query $?_B\varphi(a_1, \dots, a_\ell)$ receives a positive
answer, and there is a record $(?_B\varphi(a_1 \dots a_\ell), !a_1 \dots !a_\ell\, \varphi(a_1 \dots a_\ell))$ in the query table, then
$(a_1, \dots, a_\ell)$ is stored in the answer table of the current node, which corresponds to the instantiation
of the leftmost free variable, that is the holding node for the answer.

We now turn to the clocks which parameterize the first-order query engines. The following theorem 
provides an upper-bound on the distributed time complexity of the evaluation of a formula. 

Theorem 1. For networks of diameter $\Delta$, the distributed time complexity of the evaluation of a
formula with $w$ variables or constants by $QE_{FO}$ is bounded by $2\Delta w$.

Proof. The proof is done by induction on the number of variables and constants in the query $\psi$.

Basis: Assume $w = 2$. There are three possibilities: two constants, or two variables, or one
constant and one variable in the query $\psi$.

• If there are two constants, say $a$ and $b$, the query $\psi$ is propagated to $a$ and gets the value of
the atom $G(a, b)$, which takes at most $\Delta$ rounds. Then the answer of $\psi$ is sent back to the
requesting node, which takes at most $\Delta$ rounds. The total time is at most $2\Delta$ rounds.

• If there is one variable $x$ and one constant $a$ in $\psi$, the query is propagated to every node, at
which the variable is instantiated, and we get the answers of $G(x, a)$, which takes $\Delta$ rounds.

- When the variable is free, the answer is stored in the local table of $x$.

- When the variable is bounded, the witness/counterevidence of $\psi$ is sent back to the
requesting node, which takes at most $\Delta$ rounds. If, after $\Delta$ rounds, the requesting
node has not received any sub-answer, it is sound to consider that there are no witnesses
or counterevidences.

So the total time is $2\Delta$ rounds in both cases.

• If there are two variables, then it takes $\Delta$ rounds to instantiate one variable at every node
(suppose the formula obtained is $\eta$) and then $\Delta$ rounds for the other variable (suppose the
formula obtained is $\xi$). Therefore $2\Delta$ in all.

- If both of the variables are free variables and $\xi$ is true, then the tuple is stored in the local
table.

- If the first variable is free and the second one is bounded, then it takes $\Delta$ rounds for
the witness/counterevidence (if there is one) to get to the first instantiating node from
the second one. If $\eta$ is true, with $a$ the instantiation of the free variable, then the
answer is stored in the local table.

- If both variables are bounded, then it takes $\Delta$ rounds for the answer to get to the first
instantiating node and then $\Delta$ to the requesting node.

So the total time is at most $4\Delta$ rounds.

Therefore, for $w = 2$, the time is bounded by $2\Delta w$ rounds.

Induction: Suppose that when the sum of variables and constants is $w$, i.e. there are $\ell$ free
variables, $k$ bounded variables, $c$ constants and $w = \ell + k + c$, the time is bounded by $2\Delta w$ rounds.
We prove the result for $w + 1$.

• When there are $c + 1$ constants: there are at most $\Delta$ rounds for the sub-query to get to
the additional constant node and $\Delta$ rounds for the answer to the sub-query to get back.
Therefore the total time is at most $2\Delta(w + 1)$ rounds.

• When there are $k + 1$ bounded variables: w.l.o.g. we assume that the additional bounded
variable is the leftmost bounded variable. Then $\Delta$ rounds are sufficient to instantiate the
first variable before instantiating the second one, and $\Delta$ rounds for the answer to get to
the first instantiating node from the second one. Therefore the total time is at most $2\Delta(w + 1)$
rounds.

• When there are $\ell + 1$ free variables: it takes $\Delta$ rounds to instantiate the additional free
variable. So the total time is $2\Delta w + \Delta$.

Therefore, the distributed time is bounded by $2\Delta(w + 1)$. □

We can now settle the values of the clocks in the query engine. 

Definition 1. The value of the clock in a network of diameter $\Delta$, for an FO query with $w$ variables
or constants is $2\Delta w$.
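
As a worked instance of this definition, the sketch below computes the clock value for a small graph. In the paper the diameter is part of each node's metadata; computing it by breadth-first search here is only an assumption made to keep the example self-contained, and the names `eccentricity` and `clock_value` are hypothetical.

```python
from collections import deque


def eccentricity(adjacency, source):
    """Largest hop distance from source, by breadth-first search."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return max(distance.values())


def clock_value(adjacency, w):
    """Clock of Definition 1: twice the diameter times w."""
    diameter = max(eccentricity(adjacency, node) for node in adjacency)
    return 2 * diameter * w


# Hypothetical example: the path 1 - 2 - 3 - 4 has diameter 3.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
# A query such as exists z (G(x, z) and G(z, a)) has two variables and one constant, so w = 3.
rounds_to_wait = clock_value(adjacency, w=3)
```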

The next result shows the robustness of the algorithm: its independence from the order in which 
messages are handled by the query engine. 

Proposition 2. The distributed first-order query engine is insensitive to the order of the incoming 
messages in a round. 

Proof. There are two fundamental steps in the algorithm of the query engine: query propagation
and result construction. During query propagation, queries and subqueries arriving on one node
have no interaction. They generate entries in the query table. During result construction, results
of independent queries do not interfere, and results of the same query are handled with a set
semantics. □

We can now define the distributed inference.

Definition 2. Let $\mathbf{G}$ be a graph, $\psi$ a formula with $\ell$ free variables, and $A$ a finite relation of arity
$\ell$. We write $\mathbf{G}, A \vdash_{FO} \psi$ if and only if $A$ is the union of all the answers produced by the query
engine $QE_{FO}$ on all nodes, upon request of $\psi$ from any node.

We next prove the soundness and completeness of the query engine. 

Theorem 2. For any network $\mathbf{G}$ of diameter at most $\Delta$, and any first-order formula $\psi$,
$\mathbf{G}, A \models \psi$ if and only if $\mathbf{G}, A \vdash_{FO} \psi$.

Proof. First observe that it is sufficient to prove the result for Boolean formulae. Indeed, if there
are $\ell$ free variables in the query, they get instantiated by all possible instantiations when the
query travels around the system of $n$ nodes, resulting in Boolean first-order queries. The result
of each query (tuple of $\ell$ constants) is then stored at the key node if it satisfies the Boolean query.

The result is also rather obvious for variable-free formulae. Suppose that a query has $c$ ($c \geq 2$)
constants and no variables. The query is broadcast to every node and, as it successively reaches
nodes, it gets the Boolean value for the atoms containing the corresponding constants, replaces the
corresponding atoms by their value and produces a new query which is broadcast again. The
result is obtained when the query has reached (at most) $c - 1$ of the constants. Then the answer
is sent back to the requesting node. The total time required is at most $2\Delta(c - 1)$. The clock time
being fixed at $2\Delta c$ rounds, it suffices to get the result.

The rest of the proof is done by induction on the number of bounded variables for Boolean
formulae.

Basis: Assume the query has one bounded variable. Then it must have at least one constant,
so $c \geq 1$. First it is broadcast by the requesting node and the variable is instantiated by every
node, thus producing $n$ sub-queries with at most $c + 1$ constants. After the sub-queries reach at most
$c - 1$ of the constants (note that one of the constants stems from instantiating the variable and the
sub-queries get it immediately at the instantiating node) and get their answers, the witness for $\exists$ or
the counterevidence for $\forall$ is sent back to the requesting node, which then produces the final answer.
If no witnesses/counterevidences are received before the clock time elapses, a negative/positive
answer is produced by the requesting node.

Induction: Assume that if the query has $k$ ($k \geq 2$) bounded variables and $c$ constants, i.e.
the query is of the form

$\psi_k = A_1 x_1 \dots A_k x_k\, \varphi(x_1 \dots x_k)$

(denoting $\exists$ or $\forall$ by $A$), then $\mathbf{G} \models \psi_k$ if and only if $\mathbf{G} \vdash_{FO} \psi_k$.
We prove the result for the case when there are $k + 1$ bounded variables in the query

$\psi_{k+1} = A_1 x_1 \dots A_{k+1} x_{k+1}\, \varphi(x_1 \dots x_{k+1})$

After the first variable has been instantiated at each node, the $n$ sub-queries of the form

$\psi'_k = A_2 x_2 \dots A_{k+1} x_{k+1}\, \varphi(x_2 \dots x_{k+1})$

are queries with $k$ bounded variables and $c + 1$ constants. They are then further propagated by
the instantiating node. By induction assumption, $\mathbf{G} \models \psi'_k$ if and only if $\mathbf{G} \vdash_{FO} \psi'_k$, so every node
gets a sound answer to $\psi'_k$. After one instantiating node gets the answer to $\psi'_k$, it sends the answer
to the requesting node. If it is true and $A_1$ is $\exists$ then the requesting node takes it as a witness
and $\psi_{k+1}$ is true; if it is false and $A_1$ is $\forall$ then the requesting node takes it as a counterevidence
and $\psi_{k+1}$ is false. If the requesting node does not receive any witnesses/counterevidences until the
clock time has elapsed, it gives a negative/positive answer to $\psi_{k+1}$. Therefore $\mathbf{G} \models \psi_{k+1}$ if and
only if $\mathbf{G} \vdash_{FO} \psi_{k+1}$. □

We next consider the complexity of the distributed evaluation. Theorem 3 is the fundamental
result of this section. It shows the potential for distributed evaluation of first-order queries with
logarithmic in-node time complexity, distributed time linear in the diameter of the graph, and
polynomial amount of communication.

Theorem 3. Let $\mathbf{G}$ be a graph of diameter $\Delta$, with $n$ nodes, and let $\varphi$ be a first-order formula with
$v$ variables. The complexity of the distributed evaluation of the query $\varphi$ on $\mathbf{G}$ by $QE_{FO}$ is given by:

IN-TIME/ROUND    DIST-TIME      MSG-SIZE       #MSG/NODE
$O(\log n)$      $O(\Delta)$    $O(\log n)$    $O(n^{v+1})$

Proof (sketch). We assume that $\varphi$ has $\ell$ free variables, $k$ bounded variables and $c$ constants.
So $v = \ell + k$. Let $w = v + c$.

IN-TIME/ROUND

We consider the complexity in the size of the graph. The query is partially evaluated on the
local data (identifiers of neighbors) of $O(\log n)$ size. It is rewritten in a systematic fashion into
sub-queries by instantiating variables. Both operations can be performed in $O(\log n)$ time. The
searching on the query table and answer table (both of size $O(n^v)$) can be done in $O(\log n)$ time
as well by binary searching.

DIST-TIME

As shown in Theorem 1, the distributed time for a query is $2\Delta w$, so the time complexity is in
$O(\Delta)$.

MSG-SIZE

MSG-SIZE is $O(\log n)$, since a message consists of a formula of fixed size whose constants are
node identifiers of $O(\log n)$ bits each.

#MSG/NODE

During the distributed evaluation of queries, new queries can be generated by instantiating free
and bounded variables. The total number of queries generated during the distributed evaluation
is $O(\sum_{i=1}^{v} n^i)$, which is $O(n^{v+1})$. So the number of queries and answers received by each node is
$O(n^{v+1})$. Therefore, the number of messages sent by each node is $O(n^{v+1})$. □
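
The counting argument for #MSG/NODE can be checked numerically. The sketch assumes that a query with i instantiated variables has at most n^i instances, one per choice of nodes, and that up to v variables get instantiated; the function name `generated_queries` is hypothetical.

```python
def generated_queries(n, v):
    """Upper bound on the sub-queries obtained by instantiating 1, 2, ..., v variables."""
    return sum(n ** i for i in range(1, v + 1))


# For a network of 10 nodes and a formula with 3 variables.
total = generated_queries(10, 3)
within_bound = total <= 10 ** (3 + 1)
```

The geometric sum is dominated by its last term, so the stated bound of n to the power v + 1 is comfortable.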

Note that the first-order query engine relies on a naive evaluation of queries. It can be optimized 
by taking advantage of the patterns in the query to limit the propagation of subqueries, but this 
does not affect the global complexity upper bounds. 

5 Distributed complexity of FP 

We next consider the complexity upper bounds for FP. It relies on a query engine which is defined 
as follows. Note that we first assume that the system is synchronous and we discuss asynchronous 
systems at the end of the present section. 

Query engine for FP, $QE_{FP}$. At first, the requesting node broadcasts $\mu(\varphi(T)(x_1, \ldots, x_\ell))$ (where
$T$ is a relational symbol of arity $\ell$). It takes $\Delta$ rounds for all nodes to receive the query. In order
to coordinate the computation of the stages of the fixpoint on different nodes, a hop counter $c$ is
broadcast together with the query $\mu(\varphi(T)(x_1, \ldots, x_\ell))$, and a clock $\sigma$ is set for each node. Initially,
the requesting node sets $\sigma = \Delta$, and broadcasts $(\mu(\varphi(T)(x_1, \ldots, x_\ell)), \Delta - 1)$ to its neighbors. Each
node receiving messages of the form $(\mu(\varphi(T)(x_1, \ldots, x_\ell)), c)$ sets $\sigma = c$ and propagates the formula
$(\mu(\varphi(T)(x_1, \ldots, x_\ell)), c - 1)$ to its neighbors, unless $c = 0$ or $\sigma$ has been set before.
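The effect of the hop counter can be checked on a small example. The following Python sketch is only an illustration under stated assumptions: the graph is connected, rounds are synchronous, the diameter is known to the requesting node, and the names `bfs_distances` and `broadcast_with_hop_counter` are hypothetical helpers, not part of the query engine. It floods the pair (query, counter) and records, for each node, the round of reception and the value assigned to the clock $\sigma$; a node at distance $d$ receives counter $\Delta - d$ in round $d$, so all clocks expire in round $\Delta$.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def broadcast_with_hop_counter(adj, root):
    """Flood (query, c) round by round; return (delta, {node: (round, sigma)})."""
    # Assumption: the graph is connected and the diameter is known to the root.
    delta = max(max(bfs_distances(adj, v).values()) for v in adj)
    clocks = {root: (0, delta)}          # the requesting node sets sigma = delta
    frontier = [root]
    rnd = 0
    while frontier:
        rnd += 1
        reached = []
        for u in frontier:
            sigma = clocks[u][1]
            if sigma == 0:               # c = 0: do not propagate further
                continue
            for v in adj[u]:
                if v not in clocks:      # sigma has not been set before
                    clocks[v] = (rnd, sigma - 1)
                    reached.append(v)
        frontier = reached
    return delta, clocks


# Path 0 - 1 - 2 - 3 with node 1 as the requesting node.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
delta, clocks = broadcast_with_hop_counter(path, 1)
expiry = {v: received + sigma for v, (received, sigma) in clocks.items()}
```

In this run every entry of `expiry` equals the diameter, which is the synchronization property the engine relies on before starting the first stage.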






When the clock $\sigma$ expires, each node $a$ sets a local table for $T$ and performs the recursion on
$\mu(\varphi(T))$ by iterating the use of the first-order query engine $QE_{FO}$ on the query $\varphi(T)$ as follows:

• $a$ sets a clock $\tau = 2\Delta w$ (where $w$ is the number of variables or constants in $\varphi(T)$), and evaluates
the query $?x_1 \ldots ?x_{\ell-1}\,!a\ \varphi(T)(x_1, \ldots, x_{\ell-1}, a)$ using $QE_{FO}$, which takes time $2\Delta w$.

• If $a$ receives a query $?x_1\,!a_2 \ldots !a_\ell\ \varphi(T)(x_1, a_2, \ldots, a_\ell)$ before $\tau$ expires, $x_1$ is instantiated by $a$
to get the subquery $!a\,!a_2 \ldots !a_\ell\ \varphi(T)(a, a_2, \ldots, a_\ell)$, and the evaluation of the resulting Boolean query
on $\varphi(T)(a, a_2, \ldots, a_\ell)$ starts. If $a$ gets a positive answer to that Boolean query, it stores
$(a, a_2, \ldots, a_\ell)$ in a temporary buffer.

• When the clock $\tau$ expires, node $a$ updates the local table for $T$ and sets another clock $\eta = \Delta$.
If some new tuples $(a, a_2, \ldots, a_\ell)$ have been produced, $a$ broadcasts an informing message to
its neighbors, which will be propagated further to all the nodes in the network to inform them
that the computation has not reached a fixpoint yet.

• If some new tuples have been produced in $a$, or $a$ has received some informing messages when
the clock $\eta$ expires, it resets $\tau = 2\Delta w$ and starts the next iteration; otherwise the evaluation
terminates. □
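The stage structure of the engine can be illustrated by a centralized simulation. The sketch below is a simplification under explicit assumptions: the network, the clocks and the informing messages are abstracted away, and `staged_fixpoint` and `phi_tc` are hypothetical names introduced only for this illustration. What it keeps is the essential discipline: within one stage, every candidate tuple is tested against the table produced by the previous stage, and the iteration stops as soon as a stage produces no new tuple.

```python
from itertools import product


def staged_fixpoint(nodes, arity, phi):
    """Compute T stage by stage; every stage reads the table of the previous one."""
    table = set()
    stages = 0
    while True:
        # All candidate tuples are tested against the same frozen table,
        # mirroring nodes that evaluate a stage simultaneously.
        new = {t for t in product(nodes, repeat=arity)
               if t not in table and phi(table, t)}
        if not new:                      # nothing new anywhere: fixpoint reached
            return table, stages
        table |= new
        stages += 1


edges = {(0, 1), (1, 2), (2, 3)}
nodes = [0, 1, 2, 3]


def phi_tc(table, t):
    """phi(T)(x, y): G(x, y), or there is z with T(x, z) and G(z, y)."""
    x, y = t
    return (x, y) in edges or any((x, z) in table and (z, y) in edges for z in nodes)


closure, stages = staged_fixpoint(nodes, 2, phi_tc)
```

On this directed path the transitive closure is reached after three productive stages, well below the bound of $n^\ell = 16$ stages for a binary relation over four nodes.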

Definition 3. Let $\mu(\varphi(T))$ be a fixpoint formula. $G, I \vdash_{FP} \mu(\varphi(T))$ if and only if, upon request of
$\mu(\varphi(T))$ from any node $a$, the query engine $QE_{FP}$ produces the answer $I$ distributed in the network.

As for FO, we show that the query engine is sound and complete.

Theorem 4. For a network $G$ and a fixpoint formula $\mu(\varphi(T))$, $G, I \models \mu(\varphi(T))$ if and only if
$G, I \vdash_{FP} \mu(\varphi(T))$.

The proof of Theorem 4 follows easily from Theorem 2.

Theorem 5. Let $G$ be a graph of diameter $\Delta$, with $n$ nodes, $T$ a relation symbol of arity $\ell$, and
$\mu(\varphi(T)(x_1, \ldots, x_\ell))$ be an FP formula with $v = \ell + k$ (first-order) variables ($\ell$ free and $k$ bound).
The complexity of the distributed evaluation of the query $\mu(\varphi(T))$ by $QE_{FP}$ on $G$ is given by:

IN-TIME/ROUND    DIST-TIME             MSG-SIZE       #MSG/NODE
$O(\log n)$      $O(n^{\ell}\Delta)$   $O(\log n)$    $O(n^{\ell+v+1})$

Proof. Let $w$ be the total number of variables and constants in $\varphi(T)(x_1, \ldots, x_\ell)$.

Messages $(\mu(\varphi(T)(x_1, \ldots, x_\ell)), hop)$ are transferred in the network before the clock $\sigma$ expires,
which takes $O(\Delta)$ rounds and $O(1)$ messages for each node.

Queries $?x_1 \ldots ?x_{\ell-1}\,!a\ \varphi(T)(x_1, \ldots, x_{\ell-1}, a)$ are evaluated after the clock $\sigma$ expires and before $\tau$
expires. $O(n^{v})$ messages are sent by each node for each such query (there are at most $v - 1$ variables
in $\varphi(T)(x_1, \ldots, x_{\ell-1}, a)$) by Theorem 3. Since there are $n$ such queries
$?x_1 \ldots ?x_{\ell-1}\,!a\ \varphi(T)(x_1, \ldots, x_{\ell-1}, a)$, the total number of messages sent by each node is $O(n^{v+1})$.

When $\tau$ expires, each node sets a clock $\eta = \Delta$, and broadcasts informing messages to its
neighbors if some new tuples are produced. Each node that receives an informing message broadcasts
it to its neighbors unless it has done so before. Each node sends $O(1)$ informing messages before
$\eta$ expires.

When $\eta$ expires, if a node has produced some new tuples or received some informing messages
during the previous iteration, it starts the next iteration.

So before the evaluation terminates, in each iteration period of length $2\Delta w + \Delta$ after the expiration of
$\sigma$, at least one new tuple in $T$ is produced in some node; thus there are at most $n^{\ell}$ such periods
before the termination of the evaluation, since there are at most $n^{\ell}$ tuples in $T$.

Consequently the total time of the evaluation is in $\Delta + n^{\ell}(2\Delta w + \Delta) = O(n^{\ell}\Delta)$.

Because in each such period $O(n^{v+1})$ messages are sent by each node, the total number of
messages sent by each node before the termination of the evaluation is $O(n^{\ell+v+1})$. □

Although the complexity upper bounds for DIST-TIME and #MSG/NODE are polynomial, the
exponent depends on the number of variables. For most networking functionalities, this number is
small, and the dependencies between the variables might lower it even further.

The algorithm $QE_{FP}$ above can be adapted to an asynchronous system by using a
breadth-first-search (BFS) spanning tree (with the requesting node as the root), without impact on the
complexity bounds. If an arbitrary spanning tree, not necessarily a BFS tree, is used, then the
complexity bounds do not change, except for the distributed time, which becomes $O(n^{\ell+1})$.

Note that with $QE_{FP}$, nodes are coordinated to compute every stage of the fixpoint
simultaneously by using the clock $2\Delta w$, which is critical for preserving the centralized semantics of FP
formulae. However, if $\varphi$ is monotone in $T$, the centralized semantics of the fixpoint is preserved no
matter whether the stages are computed simultaneously or not. Similar results can be shown for
alternative definitions of the fixpoint logic, such as Least Fixpoint.

6 In-node behavioral compilation 

In this section, we see how to transform FO and FP formulae, which express queries at the global
level of abstraction of the graph, to equivalent rule programs that model the behavior of nodes.
We first introduce the Netlog language.

A Netlog program is a finite set of rules of the form:

$(\uparrow)\ \gamma_0 \ \text{:-}\ \gamma_1; \ldots; \gamma_l.$

where $l \geq 0$. The head of the rule, $\gamma_0$, is an atomic first-order formula. The body, $\gamma_1; \ldots; \gamma_l$, is
constituted of literals, i.e., atomic ($R(\vec{x})$) or negated atomic ($\neg R(\vec{x})$) formulae. Each atomic
formula $\gamma_i$ has a holding variable, which is written explicitly as @x and specifies the node on which
the evaluation is performed. The communication construct, ↑, is added before the head if the result
is to be pushed to neighbors.

In the sequel we denote the head of a rule $r$ as $head_r$ and the body as $body_r$, and denote the
holding variable of a formula $\gamma_i$ as $hv_{\gamma_i}$. The relations occurring in the head of the rules are called
intensional relations.

Some localization restrictions are imposed on the rules to ensure the effectiveness of the
distributed evaluation.

(i) All literals in the body have the same holding variable;

(ii) the head is not pushed (by ↑) if the holding variable of the head is the holding variable of the
body;

(iii) if the head is pushed (by ↑), assuming the holding variable of the head is x and the holding
variable of the body is y, then G(@y, x) is in the body.
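The three localization restrictions are purely syntactic, so they can be checked mechanically. The following Python sketch shows one possible encoding; the classes `Literal` and `Rule` and the function `violations` are hypothetical names chosen for this illustration, and comparison literals such as x ≠ z are assumed to be left out of the body since they carry no holding variable.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Literal:
    relation: str
    args: Tuple[str, ...]
    holder: str               # holding variable, written @x in Netlog
    negated: bool = False


@dataclass(frozen=True)
class Rule:
    head: Literal
    body: Tuple[Literal, ...]
    pushed: bool = False      # True if the head carries the communication construct


def violations(rule):
    """Return the labels of the localization restrictions broken by the rule."""
    holders = {lit.holder for lit in rule.body}
    if len(holders) > 1:      # (i): one holding variable for the whole body
        return ["i"]
    if not holders:
        return []
    body_holder = next(iter(holders))
    broken = []
    if rule.pushed and rule.head.holder == body_holder:
        broken.append("ii")   # (ii): no push when head and body share the holder
    if rule.pushed and rule.head.holder != body_holder:
        has_edge = any(
            lit.relation == "G" and not lit.negated
            and lit.holder == body_holder and rule.head.holder in lit.args
            for lit in rule.body
        )
        if not has_edge:
            broken.append("iii")  # (iii): G(@y, x) must occur in the body
    return broken


# A pushed rule of the shape: askT(@x, h, d) :- T(@h, z, d); G(@h, x).
ask_t = Rule(
    head=Literal("askT", ("x", "h", "d"), "x"),
    body=(Literal("T", ("h", "z", "d"), "h"), Literal("G", ("h", "x"), "h")),
    pushed=True,
)
mixed = Rule(
    head=Literal("Q", ("x",), "x"),
    body=(Literal("R", ("x", "y"), "x"), Literal("S", ("y",), "y")),
)
self_push = Rule(
    head=Literal("Q", ("x",), "x"),
    body=(Literal("R", ("x",), "x"),),
    pushed=True,
)
```

Here `violations(ask_t)` is empty, while `mixed` breaks restriction (i) and `self_push` breaks restriction (ii).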
A Netlog program is running on each node of the network concurrently. All the rules are applied
simultaneously on a node. The holding variable of literals in the body is instantiated by the node
ID itself. Facts deduced are stored on the node if the rule is not modified by ↑. Otherwise, they
are sent to nodes interpreting the holding variable of the head.

On each node, (i) phases of executions of the rules on the node and (ii) phases of communication 
with other nodes are alternating till no new facts are deduced on each node. The global semantics 
is defined as the union of the facts obtained on each node. 

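For a graph $G = (V, G)$, an instance $I$ such that $I = \bigcup_{v \in V} I_v$, where $I_v$ is the fragment of $I$ stored
on node $v$, a rule

$r : Q(\vec{x}) \ \text{:-}\ R_1(\vec{y}_1); \ldots; R_m(\vec{y}_m); \neg R_{m+1}(\vec{y}_{m+1}); \ldots; \neg R_l(\vec{y}_l).$

and an instantiation $\sigma$ of the variables occurring in $r$,

$(I, \sigma) \models_G R_1(\vec{y}_1); \ldots; R_m(\vec{y}_m); \neg R_{m+1}(\vec{y}_{m+1}); \ldots; \neg R_l(\vec{y}_l)$

if and only if

$R_i(\sigma(\vec{y}_i)) \in I_{\sigma(y)} \cup G$ for $i \in [1, m]$, and
$R_i(\sigma(\vec{y}_i)) \notin I_{\sigma(y)} \cup G$ for $i \in [m+1, l]$,

where $y$ is the holding variable of $body_r$.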

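We define the immediate consequence operator of a Netlog program $P$ as a mapping from an
instance $I$ to an instance:

$\Psi_{P,G}(I) = \bigcup_{v \in V} \{\, Q(\vec{u}) \mid \exists r \in P : Q(\vec{x}) \ \text{:-}\ body_r,\ \exists \sigma \text{ s.t. } (I, \sigma) \models_G body_r,\ \vec{u} = \sigma(\vec{x}),\ \sigma(hv_{Q(\vec{x})}) = v \,\}.$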



The computation of a Netlog program $P$ on a graph $G$ is given by the following sequence:

$I_0 = \emptyset$;
$I_{i+1} = \Psi_{P,G}(I_i)$, $i \geq 0$.

The computation of $P$ on $G$ terminates if the sequence $(I_i)_{i \geq 0}$ converges to a fixpoint. If the
computation of $P$ on $G$ terminates, we define $P(G)$ to be the least fixpoint obtained by the
computation sequence $(I_i)_{i \geq 0}$.
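The alternation of rule application and communication can be sketched in a few lines of Python. The program simulated below is a hypothetical three-rule example, not one of the paper's programs: `Reach(@x) :- ReqNode(@x).`, a pushed rule `↑ Reach(@y) :- Reach(@x); G(@x, y).`, and the persistence rule `Reach(@x) :- Reach(@x).` The function names `step` and `compute`, and the encoding of an instance as a dictionary from nodes to their local fragments, are assumptions made for this illustration.

```python
def step(adj, req, inst):
    """One application of the operator; inst maps each node to its local facts."""
    nxt = {v: set() for v in adj}
    for x in adj:                        # every node applies all rules at once
        local = inst[x]
        if x == req:                     # Reach(@x) :- ReqNode(@x).
            nxt[x].add("Reach")
        if "Reach" in local:
            nxt[x].add("Reach")          # Reach(@x) :- Reach(@x).
            for y in adj[x]:             # pushed: Reach(@y) :- Reach(@x); G(@x, y).
                nxt[y].add("Reach")
    return nxt


def compute(adj, req):
    """Iterate from the empty instance until two consecutive instances coincide."""
    inst = {v: set() for v in adj}
    rounds = 0
    while True:
        nxt = step(adj, req, inst)
        if nxt == inst:
            return inst, rounds
        inst, rounds = nxt, rounds + 1


line = {0: [1], 1: [0, 2], 2: [1]}
result, rounds = compute(line, 0)
```

Because the sequence is not inflationary, the persistence rule is what keeps `Reach` alive from one instance to the next; without it the facts would be recomputed from scratch at every step.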

Before we see how FO or FP formulae can be rewritten into Netlog programs, let us first
illustrate the technique on the examples of Section 2.

Example 1. The following program computes the OLSR-like table-based routing protocol as defined
in Section 2.

    T(@x, d, d) :- G(@x, d).
    T(@x, h, d) :- ¬existT(@x, d); G(@x, h); askT(@x, h, d).
    existT(@x, d) :- T(@x, u, d).
    ↑ askT(@x, h, d) :- T(@h, z, d); G(@h, x); x ≠ z.
    T(@x, d, d) :- T(@x, d, d).

New predicates (askT) are introduced to store partial results that are computed on some nodes,
and used by other nodes to which they have been forwarded. The last rule ensures the inflationary
behavior (accumulation of results).

Example 2. The following program computes spanning trees as defined in Section 2. Several new
predicates are introduced to reduce the complexity of the formula (delay, rej) and to ensure the
transfer of data between the nodes involved in the computation (askST).

    ↑ ST(x, @y) :- G(@x, y); ReqNode(@x).
    ST(x, @y) :- ¬existST(@y); delay(x, @y); ¬rej(x, @y).
    ↑ askST(x, @y) :- ST(w, @x); G(@x, y); w ≠ y.
    existST(@y) :- ST(x, @y).
    rej(x', @y) :- askST(x, @y); askST(x', @y); x' > x.
    delay(x, @y) :- askST(x, @y).
    ST(x, @y) :- ST(x, @y).



Example 3. The following program computes the AODV-like on-demand routing protocol as defined
in Section 2.

    ↑ RouteReq(x, @y, d) :- G(@x, y); ReqNode(@x); dest(d).
    RouteReq(x, @y, d) :- askRouteReq(x, @y, d); ¬existRR(@y, d).
    ↑ askRouteReq(x, @y, d) :- RouteReq(w, @x, d); G(@x, y); x ≠ d; w ≠ y.
    existRR(@y, d) :- RouteReq(w', @y, d).
    ↑ Nexthop(@x, d, d) :- RouteReq(x, @d, d); G(@d, x).
    ↑ Nexthop(@x, y, d) :- RouteReq(x, @y, d); Nexthop(@y, z, d); G(@y, x).
    RouteReq(x, @y, d) :- RouteReq(x, @y, d).
    Nexthop(@x, d, d) :- Nexthop(@x, d, d).



We now consider the general translation of FO and FP formulae to Netlog programs. It has been
shown in [2] that FP is equivalent to Datalog$^{\neg}$, both with inflationary semantics. Moreover, both
FO and FP formulae can be translated effectively to Datalog$^{\neg}$ programs. We therefore consider
the translation of Datalog$^{\neg}$ programs into equivalent Netlog programs. The main difficulty lies
in the distribution of the computation.

The syntax and semantics of Datalog$^{\neg}$ are similar to those of Netlog, but without the
communication primitives. Indeed, unlike Netlog, a program in Datalog$^{\neg}$ is processed in a centralized
manner. The computation of a Datalog$^{\neg}$ program $P$ on a graph $G$ is given by the following
sequence:

$I_0 = \emptyset$;
$I_{i+1} = \Psi_{P,G}(I_i) \cup I_i$, $i \geq 0$,

where $\Psi_{P,G}(I_i)$ is defined in a similar way as for Netlog.






The following algorithm rewrites a Datalog$^{\neg}$ program $\mathcal{P}_{DL}$ into a Netlog program $\mathcal{P}_{NL}$. To
synchronize stages of the recursion, there is a fact start(a) stored on each node a at the beginning
of the computation, which triggers a clock used to coordinate stages.

In the sequel we do not distinguish between G(@x, y) and G(@y, x).

Rewriting Algorithm:

The algorithm rewrites the input program step by step.

Step 1: Distributing Data

Input: $\mathcal{P}_{DL}$. Output: $\mathcal{P}_1$.

Algorithm Localize($\mathcal{P}_{DL}$) chooses one variable as the holding variable for each relation in $\mathcal{P}_{DL}$.
$\mathcal{P}_1$ is obtained by marking the holding variable of each literal in $\mathcal{P}_{DL}$.

The Rewriting Algorithm supports different assignments of holding variables. For simplicity, we
assume the leftmost variable of each relation is chosen as holding variable. For lack of space, we
do not address the associated optimization problem.
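Step 1 amounts to annotating each literal with one of its argument positions. A minimal Python sketch follows, assuming literals are encoded as pairs (relation name, argument tuple); the functions `localize` and `render` and the optional `positions` map are hypothetical and serve only to illustrate the leftmost-variable default and the possibility of other assignments.

```python
def localize(program, positions=None):
    """Mark a holding variable in every literal.

    program: list of (head, body) with literals encoded as (relation, args).
    positions: optional map relation -> argument index; default is leftmost (0).
    """
    positions = positions or {}

    def mark(literal):
        relation, args = literal
        return relation, args, args[positions.get(relation, 0)]

    return [(mark(head), [mark(lit) for lit in body]) for head, body in program]


def render(literal):
    """Print a marked literal in Netlog style, e.g. T(@x, y)."""
    relation, args, holder = literal
    shown, marked = [], False
    for a in args:
        if a == holder and not marked:
            shown.append("@" + a)
            marked = True
        else:
            shown.append(a)
    return f"{relation}({', '.join(shown)})"


# T(x, y) :- G(x, z); T(z, y).
program = [(("T", ("x", "y")), [("G", ("x", "z")), ("T", ("z", "y"))])]
head, body = localize(program)[0]
alt_head, _ = localize(program, {"T": 1})[0]
```

With the default assignment the body literals G(@x, z) and T(@z, y) have different holding variables, which is exactly the situation that the next step of the algorithm has to resolve.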
Step 2: Distributing Computation

Input: $\mathcal{P}_1$. Output: $\langle \mathcal{P}_2, \kappa \rangle$.

Let $\Delta$ be the diameter of $G$.

For each rule $r \in \mathcal{P}_1$, assume

• $hv_{head_r}$ is the holding variable of $head_r$,

• $h_r := hv_{head_r}$, and

• $CN_r := \{h_r\}$.

Rewrite($r$, $h_r$, $CN_r$) recursively rewrites the rule $r$ into several rules until the output rules
satisfy the localization restriction (i). $body_r$ is divided into several parts: the local part that can
be evaluated locally and the non-local part that cannot be evaluated locally. $h_r$ is the holding
variable of the literals in the local part. The non-local part is partitioned into several disconnected
parts which share no variables except the variables in $CN_r$ and are evaluated by additional rules
$r_i$ on different nodes in parallel. The deduced facts of $r_i$ are pushed to the node where the rule $r$
is evaluated. Meanwhile, it calculates the number of rounds $\kappa_r$ for evaluating $r$.

Rewrite($r$, $h_r$, $CN_r$): output $\langle T_r, \kappa_r \rangle$

Begin

Assume

$r : \gamma \ \text{:-}\ \gamma_1; \ldots; \gamma_l.$

where $l \geq 1$.

Let $S = \{\gamma_1, \ldots, \gamma_l\}$, $S' = \{\gamma_i \mid \gamma_i \in S \text{ and } hv_{\gamma_i} = h_r\}$, so that $S'$ contains all the literals in $body_r$
whose holding variable is the same as the one of the head, $h_r$.

- If $S' = S$, then $T_r := \{r\}$, and $\kappa_r := 1$.

- If $S' \neq S$,
Begin

Let $S'' := S - S'$, so that $S''$ contains all the literals in $body_r$ whose holding variables are not
$h_r$.

For $\gamma_j, \gamma_k \in S''$, let $\gamma_j \sim \gamma_k$ if $\gamma_j$ and $\gamma_k$ have some common variables besides the variables in
$CN_r$. Assume $\{S''_1, \ldots, S''_n\}$ ($n \geq 1$) is a partition of $S''$ in minimal subsets closed under $\sim$,
so that the literals in $S''$ are divided into disconnected "subgraph" components.

For each $S''_i$, $i \in [1, n]$, let

$T_i := \{hv_{\gamma_w} \mid \gamma_w \in S''_i \text{ and } G(@h_r, hv_{\gamma_w}) \in S'\}$,

so that $T_i$ contains the variables which are the holding variable of one literal in $S''_i$ and are
also a neighbor of $h_r$.

- If $T_i \neq \emptyset$, the non-local part $S''_i$ is connected with the local part $S'$. Choose
one variable $hv_{\gamma_{i_j}}$ from $T_i$. Let $S''_i := S''_i \cup \{G(@hv_{\gamma_{i_j}}, h_r)\}$. Let $h_{r_i} := hv_{\gamma_{i_j}}$. Let
$CN_{r_i} := CN_r \cup \{h_{r_i}\}$. Let $d_{r_i} := 1$. Assume $S''_i = \{\gamma_{i,1}, \ldots, \gamma_{i,m_i}\}$. Let

$r_i : Q_i(\vec{y}_i) \ \text{:-}\ \gamma_{i,1}; \ldots; \gamma_{i,m_i}.$

where $Q_i$ is a new relation name and $\vec{y}_i$ contains all the variables occurring both in $S''_i$
and in either $S'$ or $head_r$, that is in $var(S''_i) \cap (var(S') \cup var(head_r))$, with $h_r$ as holding
variable.

- If $T_i = \emptyset$, then the non-local part $S''_i$ is disconnected from the local part $S'$. Choose one
literal $\gamma_{i_j} \in S''_i$. Assume $y$ is a variable not occurring in $r$, let $S''_i := S''_i \cup \{y = hv_{\gamma_{i_j}}\}$. Let
$h_{r_i} := hv_{\gamma_{i_j}}$. Let $CN_{r_i} := CN_r \cup \{h_{r_i}\}$. Let $d_{r_i} := 1 + \Delta$. Assume $S''_i = \{\gamma_{i,1}, \ldots, \gamma_{i,m_i}\}$.
Let

$r_i : Q_i(\vec{y}_i) \ \text{:-}\ \gamma_{i,1}; \ldots; \gamma_{i,m_i}.$

where $Q_i$ is a new relation name and $\vec{y}_i$ contains all the variables occurring both in $S''_i$
and in either $S'$ or $head_r$, that is in $var(S''_i) \cap (var(S') \cup var(head_r))$, with $y$ as holding
variable. Moreover, let

$\hat{r}_i : Q_i(@x \ldots) \ \text{:-}\ Q_i(@y \ldots); G(@y, x).$

Assume $S' = \{\gamma'_1, \ldots, \gamma'_k\}$ ($k \geq 0$), let

$r' : \gamma \ \text{:-}\ \gamma'_1; \ldots; \gamma'_k; Q_1(\vec{y}_1); \ldots; Q_n(\vec{y}_n).$

$Q_i(\vec{y}_i)$, $i \in [1, n]$, is called a sub-query.

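Assume $\langle T_{r_i}, \kappa_{r_i} \rangle$ = Rewrite($r_i$, $h_{r_i}$, $CN_{r_i}$), let

• $T_r := \{r'\} \cup \bigcup_{i \in [1,n]} (\{\hat{r}_i\} \cup T_{r_i})$, where $\hat{r}_i$ is the forwarding rule introduced when $T_i = \emptyset$, and

• $\kappa_r := \max\{\kappa_{r_i} + d_{r_i} \mid i \in [1, n]\}$.

End

End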

Finally, let

• $\mathcal{P}_2 := \bigcup_{r \in \mathcal{P}_1} T_r$, and

• $\kappa := \max\{\Delta, \max\{\kappa_r \mid r \in \mathcal{P}_1\}\}$.
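The partition of the non-local part into minimal subsets closed under the relation "shares a variable outside $CN_r$" is a connected-components computation. The Python sketch below illustrates it with a small union-find; the function name `partition` and the encoding of a literal as a pair (name, set of variables) are assumptions made for the illustration, not notation from the algorithm.

```python
def partition(literals, cn):
    """Group literals that are linked by variables not in cn.

    literals: list of (name, set_of_variables); cn: set of variables to ignore.
    Returns the groups as sorted lists of literal names.
    """
    parent = list(range(len(literals)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(len(literals)):
        for j in range(i + 1, len(literals)):
            # The two literals are related if they share a variable outside cn.
            if (literals[i][1] & literals[j][1]) - cn:
                parent[find(i)] = find(j)

    groups = {}
    for i, (name, _) in enumerate(literals):
        groups.setdefault(find(i), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# R1 and R2 share y; R3 shares only x, which belongs to CN_r and is ignored.
non_local = [("R1", {"x", "y"}), ("R2", {"y", "z"}), ("R3", {"x", "w"})]
components = partition(non_local, {"x"})
```

Each resulting group would give rise to one sub-query rule, evaluated independently of the others.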

Step 3: Communication

Input: $\langle \mathcal{P}_2, \kappa \rangle$. Output: $\langle \mathcal{P}_3, \kappa \rangle$.

$\mathcal{P}_3$ is obtained by adding ↑ in the head of each rule $r \in \mathcal{P}_2$ whose head has a holding variable
different from the holding variable of the body, so that the rules in $\mathcal{P}_3$ satisfy the localization
restrictions (ii) and (iii).
Step 4: Stage coordination with clocks

Input: $\langle \mathcal{P}_3, \kappa \rangle$. Output: $\mathcal{P}_4$.

The rules in $\mathcal{P}_3$ are modified as follows:

- Add the literals "clock(@x, q)" and "q ≠ 0" to the body of each rule, where x is the holding
variable of the body.

- For each rule with an intensional relation R of $\mathcal{P}_{DL}$ in its head, replace R in the head with
tempR and add

    continue(@x) :- start(@x).
    ↑ inf(@y, x) :- start(@x); G(@x, y).
    clock(@x, κ) :- start(@x).
    clock(@x, p) :- clock(@x, q); q ≥ 1; p = q − 1; ¬stop(@x).
    clock(@x, κ) :- clock(@x, 0); ¬stop(@x).
    ↑ inf(@z, x) :- inf(@y, x); G(@y, z); x ≠ z; clock(@x, q); q > Δ.
    continue(@x) :- inf(@x, y); clock(@x, q); q ≠ 0.
    continue(@x) :- continue(@x); clock(@x, q); q ≠ 0.
    stop(@x) :- ¬continue(@x); clock(@x, 0).

in $\mathcal{P}_4$.
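The behavior of the clock rules on a single node can be traced with a short simulation. The Python sketch below is a simplification under explicit assumptions: it follows one node only, it takes as input a list saying whether continue holds at the end of each stage (absent entries count as false), and it assumes that a deduced stop fact persists. The function name `clock_trace` is hypothetical.

```python
def clock_trace(kappa, continue_at_stage_end):
    """Successive clock values on one node until the clock fact disappears.

    continue_at_stage_end[i] tells whether continue holds when the clock
    reaches 0 at the end of stage i (missing entries are read as False).
    """
    trace = []
    clock, stop = kappa, False           # clock(@x, kappa) :- start(@x).
    stage = 0
    while clock is not None:
        trace.append(clock)
        if stop:
            clock = None                 # both clock rules require not stop
        elif clock >= 1:
            clock -= 1                   # clock(@x, q - 1) :- clock(@x, q); q >= 1
        else:
            cont = (stage < len(continue_at_stage_end)
                    and continue_at_stage_end[stage])
            stop = not cont              # stop(@x) :- not continue(@x); clock(@x, 0)
            clock = kappa                # reset, deduced in the same step as stop
            stage += 1
    return trace


two_stages = clock_trace(2, [True, False])
```

In this sketch each stage takes $\kappa + 1$ steps, and after the stage in which continue fails the clock is reset once more, in the same step in which stop is deduced, before it disappears.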

Step 5: Inflationary result

Input: $\mathcal{P}_4$. Output: $\mathcal{P}_{NL}$.

$\mathcal{P}_{NL}$ contains the rules in $\mathcal{P}_4$ and the following rules:

- For each relation R in $\mathcal{P}_4$ except start, clock, continue, inf and stop but not in $\mathcal{P}_{DL}$, add

    R(x⃗) :- tempR(x⃗); clock(@x, 0).
    continue(@x) :- tempR(x⃗); ¬R(x⃗); clock(@x, 0).
    ↑ inf(@y, x) :- tempR(x⃗); ¬R(x⃗); clock(@x, 0); G(@x, y).

in $\mathcal{P}_4$, where x is the holding variable of both R and tempR.

- Add

    R(…@x…) :- R(…@x…); clock(@x, q); q ≠ 0.

in $\mathcal{P}_{NL}$.

- For each intensional relation R of $\mathcal{P}_{DL}$, add

    R(x⃗) :- R(x⃗).

in $\mathcal{P}_{NL}$. □



It is obvious that each rule in a program $\mathcal{P}_{NL}$ produced by the Rewriting Algorithm satisfies
the localization restrictions, and can thus be computed effectively on one node. We can now state
the main result of this section, which shows that the global semantics of $\mathcal{P}_{DL}$ coincides with the
distributed semantics of $\mathcal{P}_{NL}$.

Theorem 6. For a graph $G = (V, G)$, a Datalog$^{\neg}$ program $\mathcal{P}_{DL}$ and its rewritten Netlog program $\mathcal{P}_{NL}$
produced by the Rewriting Algorithm, the computation of $\mathcal{P}_{NL}$ on $G$ terminates iff the computation
of $\mathcal{P}_{DL}$ on $G$ terminates, and $\mathcal{P}_{NL}(G) = \mathcal{P}_{DL}(G)$.

$\mathcal{P}_{NL}$ slows down the computation of $\mathcal{P}_{DL}$. During one stage ($\kappa$ rounds) of the computation of
$\mathcal{P}_{NL}$, the clock turns from $\kappa$ to 0, the sub-queries are evaluated and the sub-results are transmitted.
At the end of each stage, the deduced facts for the intensional relations of $\mathcal{P}_{DL}$ are accumulated and
all the sub-results are cleared. Hence, one such stage of $\mathcal{P}_{NL}$ is equivalent to one stage of $\mathcal{P}_{DL}$. For
an intensional relation $R$ of $\mathcal{P}_{DL}$, $R(\vec{c}) \in I_{DL_i}$ if and only if $R(\vec{c}) \in I_{NL_{i(\kappa+1)+1}}$, $i \geq 0$, where
$I_{DL_i}$ and $I_{NL_i}$ are the stages of the fixpoints of $\mathcal{P}_{DL}$ and $\mathcal{P}_{NL}$, respectively.

The termination of the computation of $\mathcal{P}_{NL}$ is ensured by the predicate stop as follows: the
computation starts with a fact start(a) on each node a, which triggers clock(a, κ), continue(a) and
inf(b, a), where b is a neighbor of a. When the clock decreases from κ to 0, the evaluation of the
sub-queries is done. The facts of an intensional relation R of $\mathcal{P}_{DL}$ are stored in tempR. Meanwhile,
inf(v, a) is pushed to all the other nodes v to inform them that the computation on a continues, so that
continue(v) is deduced. continue(a) for one stage is maintained to the end of the stage. When the
clock turns to zero, (i) the program checks if continue(a) is true. If it is false, stop(a) is deduced. Since
¬stop(a) is a precondition for decreasing the clock, and the clock is a precondition for deducing
facts of all the other relations except R, only the facts of R are preserved along the stages. Thus
the fixpoint is obtained and the computation terminates. Otherwise (stop(a) is not deduced), the
computation continues. (ii) The program compares the facts of tempR and R. If there are newly
deduced facts, these facts are added into R. Meanwhile continue(a) and inf(b, a) are deduced for
the next stage.

The proof of Theorem 6 relies on the following Lemma and the fact that the Rewriting Algorithm
produces only rules satisfying the localization restrictions.

Lemma 7. For a graph $G = (V, G)$, a Datalog$^{\neg}$ program $\mathcal{P}_{DL}$ and its rewritten Netlog program $\mathcal{P}_{NL}$
produced by the Rewriting Algorithm, the computation sequence $(I_{NL_j})_{j \geq 0}$ for $\mathcal{P}_{NL}$ satisfies:

1. For each relation $R$ in $\mathcal{P}_{NL}$, $R(\vec{c}) \in I_{NL_p}$ iff $R(\vec{c}) \in I_{NL_p, c_i}$ and $R(\vec{c}) \notin I_{NL_p, c'}$, where $c_i$
is the holding node of $R(\vec{c})$ and $c' \neq c_i$.

2. $I_{NL_0} = \{start(v) \mid v \in V\}$.

3. If $clock(a, c) \in I_{NL_p}$, then $clock(v, c) \in I_{NL_p}$ for all $v \in V$. If $stop(a) \in I_{NL_p}$, then $stop(v) \in I_{NL_p}$
for all $v \in V$.
4- If stop{a) G InLs, then (i) clock{a, k) G InLs, (H) for q G dock{a, k — p) G iNLg, 

p G [0, k], iff q = n{K + 1) + p + 1 and (in) if RCc) G Inlj where f > s + I, then R is an 
intensional relation ofVoL- Continue{a) ^ lNLn{k+) for any a £ V and any p > s — {k + 1) 
iff stop{a) G Inls- (an) continue{a) ^ Inls- 

5. For each relation R in Vnl but not in Vdl, except the relations start, clock, continue, 
inf and stop, (i) if R{~c) G iNLp, then clock{a,K) ^ iNLp, and (ii) if p = n{K + 1) + q, 
g G [2, K + 1] , then R{'c) G lNLn{K+i)+q' , ^' G « + 1] ■ 

6. For each intensional relation R of Vdl, if R{~c) S Inlp then RCc) G Inlp' where p' > p. 
Assume q = min{p\R{~c) G iNLp}, then clock{a,K) G iNLq- 

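Now we prove Theorem 6.

Proof. Assume the computation sequence for $\mathcal{P}_{DL}$ is $(I_{DL_i})_{i \geq 0}$ and for $\mathcal{P}_{NL}$ is $(I_{NL_j})_{j \geq 0}$. We prove that
for any intensional relation $Q$ of $\mathcal{P}_{DL}$, $Q(\vec{c}) \in I_{NL_{i(\kappa+1)+1}}$ iff $Q(\vec{c}) \in I_{DL_i}$.

Basis: $i = 0$, $I_{DL_0} = \emptyset$ and $I_{NL_1} = \{continue(a), inf(b, a), clock(a, \kappa) \mid a \in V, G(a, b)\}$.

Induction: Suppose for $n \geq 0$, and each intensional relation $Q$ of $\mathcal{P}_{DL}$,

$Q(a_1, \ldots, a_k) \in I_{DL_n}$ iff $Q(a_1, \ldots, a_k) \in I_{NL_{n(\kappa+1)+1}}$.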

First we prove that for $n + 1$, if $Q(b_1, \ldots, b_k) \in I_{DL_{n+1}}$, then $Q(b_1, \ldots, b_k) \in I_{NL_{(n+1)(\kappa+1)+1}}$.
If $Q(b_1, \ldots, b_k) \in I_{DL_{n+1}}$, then (i) $Q(b_1, \ldots, b_k) \in I_{DL_n}$ or (ii) $Q(b_1, \ldots, b_k)$ is a newly deduced
fact in $I_{DL_{n+1}}$.

If $Q(b_1, \ldots, b_k) \in I_{DL_n}$, then $Q(b_1, \ldots, b_k) \in I_{NL_{n(\kappa+1)+1}}$ by the induction hypothesis, and
$Q(b_1, \ldots, b_k) \in I_{NL_p}$ where $p \geq n(\kappa+1)+1$ by Lemma 7.6, therefore $Q(b_1, \ldots, b_k) \in I_{NL_{(n+1)(\kappa+1)+1}}$.
Otherwise ($Q(b_1, \ldots, b_k) \notin I_{DL_n}$), there is one rule $r \in \mathcal{P}_{DL}$

$r : Q(x_1, \ldots, x_k) \ \text{:-}\ R_1(\vec{y}_1); \ldots; R_m(\vec{y}_m); \neg R_{m+1}(\vec{y}_{m+1}); \ldots; \neg R_l(\vec{y}_l).$

and an instantiation a of the variables in r such that cr(xj) = 6j for i G [1, k] 

P / / G ^DLn U G, for i G [1, m] 

^/^^„uG, foriG [m+l,Z] 

and for some e G i?e(cr(y^)) ^ Idlu-i U C By the induction hypothesis and Lemma[7ll, 

^ _ f G /ArLn(K+l)+l,<7{tojjJ U G, for i G [l,m] 
' I ^ -^ArLn(K+i)+i,<7{toflJ U G, for z G [m + 1, Z] 

and Re{a{yl)) ^ lNL(n-\){n+\)^\,u(hvR^) U G. According to Lemma[7l6 

^.l^iy^JJ j ^/^^p,,(,,^^)UG, foriG[m+l,/] 

where p G [n(K + 1) + 1, (n + 1)(k + 1)] and Re{a{y^)) is newly deduced in lNLn{K+i)+i- So 
continue{a{hvR^)) G /Ari„(K+i)+i. By Lemma [714, stop{a) ^ -?^ArLn(K+i)+i and clock{a,K — p) G 
-^ArLn(K+i)+i+p for p G [0, k] and for any aeV. 
Because 

xi, . . . , X};) : teTnpQ(^@xi, ■ ■ ■ , X}^ );clock{@xi,0). 

is in Vnl, so iftempQ{bi, . . . ,6fc) G n(K+l) + l+p, p G [k^, k], then . . . ,6^) G /ArL(„+i)(K+i)+i 
since Kr < k. 

According to Rewriting Algorithm, = hvq, CN^ = {hr} and 

• if all the holding variables of the literals in bodi/r are the same with hr (S" = S), then 

tempQ{@xi, ...,Xk): -Riijjt)] ■ ■ ■ ] Rm{yZ)-,^Rm+i{lM+i); ■ ■ ■ ] -^Riiyi);clock{@xi,q);q / 0. 
and 

tempQ{@xi, . . . , Xk) ■ —tempQ{@xi, . . . , Xk); clock{@xi,q); q ^ 0. 

are in Vnl- i^r = 1- Therefore tempQ{bi, . . . ,bk) G lNLn(K+i)+i+p for each p G and 
Q{bi,...,bk) G 

-^AfL(n+i)(K+i)+i by Lemma[711 and [7l5. 

• Otherwise, not all of the holding variables of the literals in body^ are the same with 
(S' / S). Assume hvR^ = ■■■ = hvR^ = hvR^^-^ = ■■■ = hvR^^.^ = hr- Then 

tempQ{@xi, ... ,Xk) : . . . ; Rwiiu); -^Rm+iiUm+i); • • • ; -^Rm+uiUm+u); 

Qiizt); . . .■,Qt{zt);clock{@xi,q);q 7^ 0. 

and 

tempQ{@xi, . . . , Xk) ■ —tempQ{@xi, . . . , x^); clock{@xi,q); q ^ 0. 

are in Vnl where QiCzt) is in headr^ for G Vnl. If for each i G Qi{ci) G lNL'n{i^ + 

1) + 1 + (kj. — 1), where = aCzi), then tempQ{bi, . . . ,bk) G lNLn{K, + 1) + 1 + k,., then 
tempQ{bi, . . . ,bk) G Inl^^k + 1) + 1 + p, p ^ [nr , k]. is as follows: 

Literals . . . , Rm{y^), ^Rm+u+i{ym+u+i) , • • • , ^Riim) are grouped into subsets 

. . . , 5^, such that he literals in different subsets have no common variables except the 
variable in CN^ which is xi. 

For each S'/, 

— if some of the holding variables of the literals in S" are the neighbors of hr , (Tj 7^ 0) , then 
G(@hv^^ ,hr) where hv^- is one of such variables, is added into S'/. Then hr = hv^^ 
and CNr^ = CNrU{hrJ. Literals in 5f along with "clock{@hv^^^,q)" , "q ^ 0" constitute 
bodun. "T Qi(^)" constitute headr^ where 'zt contains all the variables both in bodt/n 
and in any of Ri{yt), ■■■ , Rwiv^), -^Rm+i{ym+i), ^Rm+u{ym+u) or headr, with hr 
as the holding node. If the evaluation for rj is finished, the result for the sub-query Qi 
gets to a{hr) in the next round, dj.. = 1. 

— Otherwise (non of the holding variables of the literals in S'l are the neighbors of hr), 
"y = hv-y.^" is added into S" where 7^4 is one literal in S^' and y does not occurs in r. 
hr,, = hvy,^. CNr, = CNr U {/i^J. Literals in S'l along with " dock{@y,q)" , "y / 0" 
constitute bodyn- headr^ is "| Qii^zly^ where 'zl contains all the variables both in body^ 
and in any of Ri{m), Rwiiu), ^Rm+iiym+i), • • • , -^Rm+uiym+u) or headr, with y 
as holding node. 
Moreover, because the following rule is in Vnl 

T Qi{®x ...): -Qi{@y . . . ); G{@y, x); dock{@y, q); q / 0. 

therefore if the evaluation of rj is finished, then the result for the sub-query Qi is obtained 
locally in the next round and then is broadcast to every node in A rounds, dr^ = 1 + A. 

For each i G [l,t], if Qi{a{hr^) ...) £ lNLniK+i)+i+Kr^, then Qi{ct) G lNLn{K+i)+i+{Kr^+dr^)-i, 
and because Hr = max{Kri + d^}, so Qi{c^) G lNLn{K+i)+i+(p'-i) for p' G [k^- + d,.-, k^]. 

Each Tj is then rewritten by Rewrite function and the output rules are modified by the 
Rewriting Algorithm. 

A set of rules G Vnl is obtained by applying the Rewriting Algorithm on r. For r' G T,. with 
some sub-queries 

Kr' = max{Kj.i + d^i \r' G and is a sub-query of r'}, 

and for r" G without sub-queries Kj." = 1. The answers to r G is in lNLn(K+i)+i+Kr- Therefore 
Qi{a{hri) . . .) e lNLn{K+i)+i+Kr^ for each i G [l,t], so finally 5^) £ lNLin+i)ik+i)+i- 

Then we prove that if $Q(b_1, \ldots, b_k) \in I_{NL_{(n+1)(\kappa+1)+1}}$ then $Q(b_1, \ldots, b_k) \in I_{DL_{n+1}}$ for $n + 1$.

If $Q(b_1, \ldots, b_k) \in I_{NL_{(n+1)(\kappa+1)+1}}$, then

(i) $Q(b_1, \ldots, b_k) \in I_{NL_{(n+1)(\kappa+1)}}$ or

(ii) $tempQ(b_1, \ldots, b_k) \in I_{NL_{(n+1)(\kappa+1)}}$ and $clock(b_1, 0) \in I_{NL_{(n+1)(\kappa+1)}}$.

If $Q(b_1, \ldots, b_k) \in I_{NL_{(n+1)(\kappa+1)}}$, according to Lemma 7.6, $Q(b_1, \ldots, b_k) \in I_{NL_{n(\kappa+1)+1}}$. By the
induction hypothesis, $Q(b_1, \ldots, b_k) \in I_{DL_n}$, so $Q(b_1, \ldots, b_k) \in I_{DL_{n+1}}$.
Otherwise . . . , 6^) ^ ^AfL(n+i)(K+i)), 

. . . ) : -tempQ{@x . . . );clock{@x, 0) 

is in Vnl, tempQ{bi, . . . ,bk) G /ArL(n+i)(K+i), clock{bi,0) G /ArL(n+i)(K+i)- Therefore siop(a) ^ 
-^ArL(n+i)(ft+i)+i, and by Lemma [714, clock(a,K - p) € InLq, P S [0, k] iff g = n(K + 1) + 1 + p. A 
set of rules in Vnl of the following form, with Qi{xi), Quiza), • • • , Qio{zi^) as sub-queries, is used 
and only used for deducing tempQ{bi, . . . ,bk) 

{])Qi{xi) ■ -Riiivii); ■ ■ ■ ; Rim{yi^);^Rim+i{yim+i); ■ ■ ■ ;-'Rii{yii); 

Qiiizii); . . .■,Qioizk,);clock{@y,q);q / 0. 

and tempQ{bi, . . . ,bk) G Inlp, P £ [p', {n + 1)(k + 1)] for some p' G [2, (n + + 1)]. 

According to the Rewriting Algorithm, all of these rules are rewritten from a rule in $\mathcal{P}_{DL}$ with
$Q(x_1, \ldots, x_k)$ as the head and the literals $R_t(\vec{y}_t)$ and $\neg R_u(\vec{y}_u)$ occurring in these rules as the body.
Assume the rule is

$r : Q(x_1, \ldots, x_k) \ \text{:-}\ R_1(\vec{y}_1); \ldots; R_m(\vec{y}_m); \neg R_{m+1}(\vec{y}_{m+1}); \ldots; \neg R_l(\vec{y}_l).$

By Lemma 7.6,

$R_i(\sigma(\vec{y}_i)) \in I_{NL_{n(\kappa+1)+1}} \cup G$ for $i \in [1, m]$, and
$R_i(\sigma(\vec{y}_i)) \notin I_{NL_{n(\kappa+1)+1}} \cup G$ for $i \in [m+1, l]$,

where $\sigma(x_i) = b_i$ for $i \in [1, k]$, and for some $e \in [1, m]$, $R_e(\sigma(\vec{y}_e)) \notin I_{NL_{(n-1)(\kappa+1)+1}} \cup G$. By the
induction hypothesis,

$R_i(\sigma(\vec{y}_i)) \in I_{DL_n} \cup G$ for $i \in [1, m]$, and
$R_i(\sigma(\vec{y}_i)) \notin I_{DL_n} \cup G$ for $i \in [m+1, l]$,

and $R_e(\sigma(\vec{y}_e)) \notin I_{DL_{n-1}} \cup G$. So $Q(b_1, \ldots, b_k) \in I_{DL_{n+1}}$.

Therefore for an intensional relation $Q$ of $\mathcal{P}_{DL}$, $Q(\vec{c}) \in I_{DL_i}$ if and only if $Q(\vec{c}) \in I_{NL_{i(\kappa+1)+1}}$.

We now prove that the computation of $\mathcal{P}_{NL}$ on $G$ terminates iff the computation of $\mathcal{P}_{DL}$ on $G$
terminates.

The computation of $\mathcal{P}_{DL}$ on $G$ terminates,
iff
$(I_{DL_j})_{j \geq 0}$ converges,
iff
no new facts in any intensional relation of $\mathcal{P}_{DL}$ are deduced in $I_{DL_i}$ for the minimal $i$,
iff
no new facts in any intensional relation of $\mathcal{P}_{DL}$ are deduced in $I_{NL_{i(\kappa+1)+1}}$ for the minimal $i$,
iff
$continue(v) \notin I_{NL_{(i+1)(\kappa+1)}}$ and $continue(v) \in I_{NL_{i(\kappa+1)}}$,
iff
$stop(v) \in I_{NL_{(i+1)(\kappa+1)+1}}$,
iff
$clock(v, c) \notin I_{NL_{(i+1)(\kappa+1)+2}}$,
iff
only the facts in the intensional relations of $\mathcal{P}_{DL}$ are in $I_{NL_p}$, $p \geq (i+1)(\kappa+1)+2$,
iff
$(I_{NL_j})_{j \geq 0}$ converges,
iff
the computation of $\mathcal{P}_{NL}$ on $G$ terminates.

Therefore the computation of $\mathcal{P}_{NL}$ on $G$ terminates iff the computation of $\mathcal{P}_{DL}$ on $G$ terminates,
and $\mathcal{P}_{NL}(G) = \mathcal{P}_{DL}(G)$. □



7 Restriction to neighborhood 

We next consider a restriction of FO and FP to bounded neighborhoods of nodes which ensures 
that the distributed computation can be performed with only a bounded number of messages per 
node. 

Let $dist(x, y) \leq k$ be the first-order formula stating that the distance between $x$ and $y$ in the
graph is no more than $k$. Let $\mathcal{N}^k(x) = \{y \mid dist(x, y) \leq k\}$ denote the $k$-neighborhood of $x$.

Let $\varphi(x, \vec{y})$ be an FO formula with free variables $x, \vec{y}$; then $\varphi^{(k)}(x, \vec{y})$ denotes the formula
with all the variables occurring in $\varphi$ relativized to the $k$-neighborhood of $x$, that is, each quantifier
$\forall/\exists z$ is replaced by $\forall/\exists z \in \mathcal{N}^k(x)$, and $y \in \mathcal{N}^k(x)$ is added for each free variable $y$.
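The $k$-neighborhood and the relativized quantifiers have a direct operational reading. The Python sketch below is an illustration only, assuming the graph is given as an adjacency dictionary; the function names `k_neighborhood`, `exists_local` and `forall_local` are hypothetical. It computes the set of nodes at distance at most $k$ by a bounded breadth-first search and evaluates quantifiers over that set alone.

```python
from collections import deque


def k_neighborhood(adj, x, k):
    """Nodes at distance at most k from x (bounded breadth-first search)."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        if dist[u] == k:                 # do not explore beyond distance k
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def exists_local(adj, x, k, pred):
    """Relativized existential quantifier: some z in the k-neighborhood satisfies pred."""
    return any(pred(z) for z in k_neighborhood(adj, x, k))


def forall_local(adj, x, k, pred):
    """Relativized universal quantifier: every z in the k-neighborhood satisfies pred."""
    return all(pred(z) for z in k_neighborhood(adj, x, k))


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
near = k_neighborhood(path, 2, 1)
```

A relativized formula evaluated at node 2 with $k = 1$ can only inspect the nodes in `near`, whatever the size of the rest of the network.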

The local fragments of FO and FP can be defined as follows.

Definition 4. FO$_{loc}$ is the set of FO formulae of the form $\varphi^{(k)}(x, \vec{y})$.

The local fragment of FP can be defined as fixpoint of FO$_{loc}$ formulae.

Definition 5. FP$_{loc}$ is the set of FP formulae of the form $\mu(\varphi^{(k)}(T)(x, \vec{y}))$, where $\vec{y} = y_1 \ldots y_\ell$
and $T$ is of arity $\ell + 1$.

Consider again the examples of Section 2. It is easy to verify that the formula $\mu(\varphi(T)(x, h, d))$
defining the OLSR-like table-based routing is not in FP$_{loc}$. On the other hand, the formula
$\mu(\varphi(ST)(x, y))$ defining the spanning tree is in FP$_{loc}$, as well as the formulae $\mu(\varphi(RouteReq)(x, y, d))$
and $\mu(\varphi(NextHop)(x, y, d))$ defining the AODV-like on-demand routing.

7.1 Distributed complexity 

We now show that the distributed computation of the local fragments, FO$_{loc}$ and FP$_{loc}$, can be
done very efficiently. We assume that the nodes are equipped with ports for each of their neighbors.
The ports make it possible to bound the message size by a constant independent of the network size. The
proof relies as previously on specific query engines for FO$_{loc}$ and FP$_{loc}$. The query engine for FO$_{loc}$
works both in synchronous and asynchronous systems.

Query engine for FO$_{loc}$ ($QE_{FO_{loc}}$). The requesting node broadcasts the FO$_{loc}$ formula $\varphi^{(k)}(x, \vec{y})$.
For each node $a$, when it receives the query $\varphi^{(k)}(x, \vec{y})$, it collects the topology information of its
$k$-neighborhood by sending messages of $O(1)$ size, then evaluates $\varphi^{(k)}(a, \vec{y})$ (where $x$ is instantiated
by $a$) by in-node computation. Since all nodes collect their $k$-neighborhood topology information
concurrently, these computations may interfere with each other. To avoid interference between
concurrent local computations of different nodes, the traces of traversed ports are incorporated in
all messages.

Each node collects the topology of its $k$-neighborhood as follows.

• When a node $a$ receives the query $\varphi^{(k)}(x, \bar{y})$, it sends a message ("collect", $k$, $j$)
to its neighbor through each port $j$, and waits for replies.

• Upon reception of a message ("collect", $i$, $j_1 \ldots j_{2(k-i)+1}$) on port $j'$, $a$ adds
$j_1 \ldots j_{2(k-i)+1} j'$ to a table $tracelist_a$, and

  - if $i > 0$, $a$ sends on each port $j''$ such that $j'' \neq j'$ the message
    ("collect", $i-1$, $j_1 \ldots j_{2(k-i)+1} j' j''$), and waits for replies;

  - otherwise ($i = 0$), $a$ sends on port $j'$ the message ("reply", $j_1 j_2 \ldots j_{2k+1}$, $j'$, $tracelist_a$).

• Upon reception of a message ("reply", $j_1 \ldots j_{2r+1}$, $j_{2r+2} \ldots j_{2k+2}$,
$tracelist'_{r+1} \ldots tracelist'_{k+1}$) on port $j_{2r+1}$, once replies from all the other ports have
been received,

  - if $r = 0$, then for each $1 \le s \le k+1$, $a$ stores in its local memory the tuple
    $(j_1 \ldots j_{2s}; tracelist'_s)$;

  - otherwise, $a$ sends on port $j_{2r}$ the message ("reply", $j_1 \ldots j_{2r-1}$, $j_{2r} \ldots j_{2k+2}$,
    $tracelist_a\ tracelist'_{r+1} \ldots tracelist'_{k+1}$).

• After receiving replies from all ports, $a$ computes the topology of its $k$-neighborhood from the
stored tuples $(j_1 \ldots j_{2r}, tracelist)$ as follows.

Let

$T^k(a) := \{ j_1 \ldots j_{2r} \mid (j_1 \ldots j_{2r}, tracelist) \text{ is stored in the local memory of } a \}$.

Define an equivalence relation $\sim$ on $T^k(a)$ as follows: for $j_1 \ldots j_{2r}, j'_1 \ldots j'_{2s} \in T^k(a)$,
$j_1 \ldots j_{2r} \sim j'_1 \ldots j'_{2s}$ if and only if there exist stored tuples $(j_1 \ldots j_{2r}, tracelist)$
and $(j'_1 \ldots j'_{2s}, tracelist')$ such that $j_1 \ldots j_{2r} \in tracelist'$ or $j'_1 \ldots j'_{2s} \in tracelist$.

The vertex set of the $k$-neighborhood of $a$ is

$\{ j_1 \ldots j_{2r} \mid j_1 \ldots j_{2r} \in T^k(a),\ r \le k \} / \sim$,

namely the equivalence classes $[j_1 \ldots j_{2r}]$ of $\sim$ on the elements $j_1 \ldots j_{2r}$ ($r \le k$) of $T^k(a)$.
Let $[j_1 \ldots j_{2r}]$ and $[j'_1 \ldots j'_{2s}]$ be two vertices of the $k$-neighborhood of $a$. There is an edge
between $[j_1 \ldots j_{2r}]$ and $[j'_1 \ldots j'_{2s}]$ if and only if there is
$j^*_1 \ldots j^*_{2t} j^*_{2t+1} j^*_{2t+2} \in T^k(a)$ such that $j^*_1 \ldots j^*_{2t} \sim j_1 \ldots j_{2r}$
and $j^*_1 \ldots j^*_{2t+2} \sim j'_1 \ldots j'_{2s}$. □
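The reconstruction step above can be sketched in Python. This is only an illustrative reading of the construction, not the authors' implementation: the function name `neighborhood_from_traces`, the dictionary `stored`, and the toy triangle network are assumptions introduced here. Traces are tuples of port numbers, and the equivalence relation is closed transitively with a small union-find.

```python
from itertools import combinations

def neighborhood_from_traces(stored, k):
    """Rebuild vertices and edges of a k-neighborhood from stored port traces.

    `stored` (hypothetical name) maps each trace, a tuple of ports of even
    length 2r, to the tracelist (a set of traces) stored with that trace.
    """
    parent = {t: t for t in stored}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    # Merge two traces when one occurs in the tracelist stored with the other.
    for s, t in combinations(stored, 2):
        if s in stored[t] or t in stored[s]:
            parent[find(s)] = find(t)

    # Map every trace to its equivalence class (a frozenset of traces).
    classes = {}
    for t in stored:
        classes.setdefault(find(t), set()).add(t)
    cls = {t: frozenset(classes[find(t)]) for t in stored}

    # Vertices: classes that contain a trace with r <= k.
    vertices = {cls[t] for t in stored if len(t) // 2 <= k}

    # Edges: a stored trace and its prefix (one hop shorter) link two classes.
    edges = set()
    for t in stored:
        prefix = t[:-2]
        if prefix in stored and cls[prefix] != cls[t]:
            if cls[prefix] in vertices and cls[t] in vertices:
                edges.add(frozenset((cls[prefix], cls[t])))
    return vertices, edges

# Toy example (assumed): a triangle a-b-c seen from a, with k = 1.
# Port 1 of a leads to b, port 2 of a leads to c; b and c are linked by their ports 2.
stored = {
    (1, 1): {(1, 1), (2, 1, 2, 2)},        # reaches b
    (2, 1): {(2, 1), (1, 1, 2, 2)},        # reaches c
    (1, 1, 2, 2): {(2, 1), (1, 1, 2, 2)},  # reaches c via b
    (2, 1, 2, 2): {(1, 1), (2, 1, 2, 2)},  # reaches b via c
}
vertices, edges = neighborhood_from_traces(stored, k=1)
```

In this example the two traces reaching b fall into one class and the two traces reaching c into another, and the length-4 traces reveal the edge between b and c. The node a itself is not represented, since only non-empty stored traces are considered.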

We can now state our main result for $FO_{loc}$.

Theorem 8. Let $\mathbf{G} = (V, G)$ be a network with $n$ nodes and diameter $\Delta$. $FO_{loc}$ formulae
$\varphi^{(k)}(x, \bar{y})$ can be evaluated on $\mathbf{G}$ with the following complexity upper bounds:

IN-TIME/ROUND    DIST-TIME      MSG-SIZE    #MSG/NODE
$O(1)$           $O(\Delta)$    $O(1)$      $O(1)$

Note that the distributed time $O(\Delta)$ comes from the initial broadcasting of the formula. The
computation itself is fully local, and can be done in $O(1)$ distributed time. In the case of an
asynchronous system, DIST-TIME is bounded by $O(n)$.

We now consider $FP_{loc}$, which admits the same complexity bounds as $FO_{loc}$ except for the
distributed time. We first assume that the system is synchronous, and discuss the asynchronous
system later.

Query engine for $FP_{loc}$ ($QE_{FP_{loc}}$)

Request flooding. The requesting node sets a clock $\sigma$ of value $\Delta$ and broadcasts the message
$(\mu(\varphi^{(k)}(T)(x, \bar{y})), \Delta - 1)$ to its neighbors. When a node $a$ receives a message
$(\mu(\varphi^{(k)}(T)(x, \bar{y})), c)$ and has not set the clock $\sigma$ before, it sets a clock $\sigma$ of
value $c$ and, if $c > 0$, broadcasts the message $(\mu(\varphi^{(k)}(T)(x, \bar{y})), c - 1)$ to all its neighbors.
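The purpose of the countdown carried by the flooded message is that all clocks expire in the same round of a synchronous system. A minimal simulation sketch of this behavior follows; the function name `flood_with_countdown`, the adjacency dictionary `adj`, and the example path network are assumptions made for illustration, and one round is taken to be one message hop.

```python
def flood_with_countdown(adj, root, delta):
    """Simulate synchronous request flooding with a countdown clock.

    adj   : dict mapping each node to the list of its neighbors (assumed input)
    root  : the requesting node
    delta : the diameter of the network, used as the initial clock value
    Returns a dict mapping each node to the round in which its clock expires.
    """
    clock_value = {root: delta}      # value the clock was set to
    clock_set_round = {root: 0}      # round in which the clock was set
    frontier = [(root, delta - 1)]   # (sender, counter carried by its broadcast)
    rnd = 0
    while frontier:
        rnd += 1
        next_frontier = []
        for sender, c in frontier:
            for nb in adj[sender]:
                if nb not in clock_value:      # clock not set before
                    clock_value[nb] = c
                    clock_set_round[nb] = rnd
                    if c > 0:                  # keep flooding with c - 1
                        next_frontier.append((nb, c - 1))
        frontier = next_frontier
    return {v: clock_set_round[v] + clock_value[v] for v in clock_value}

# Example (assumed): a path 0 - 1 - 2 - 3 of diameter 3, requested from node 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
expiry = flood_with_countdown(path, root=1, delta=3)
```

A node at distance $d$ from the requester sets its clock to $\Delta - d$ in round $d$, so every clock expires in round $\Delta$, which lets all nodes start the next phase simultaneously.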

Topology collection. When the clock $\sigma$ expires, each node $a$ sets a clock $\sigma'$ of value $4k$ and
starts collecting all the topology information of its $2k$-neighborhood by sending messages and tracing
the traversed ports (as for Theorem 8). Each node $a$ thus obtains a $2k$-local name for each $a'$ in
its $k$-neighborhood, namely the set of traces from $a$ to $a'$ of length at most $2k$. We denote this
$2k$-local name of $a'$ at $a$ by $Name_a^{2k}(a')$.

Fixpoint computation. Each node $a$ holds a local table that stores the tuples $(a, \bar{b})$ in $T$,
using the $k$-local names $Name_a^{k}(a')$ of the nodes $a'$.

When the clock $\sigma'$ expires, each node $a$ sets a clock $\tau = 3k$ and starts evaluating the FO
formula $\varphi^{(k)}(T)(a, \bar{y})$ (where $x$ is instantiated by $Name_a^{2k}(a)$, the $2k$-local name of $a$
at $a$). Node $a$ evaluates $\varphi^{(k)}(T)(a, \bar{y})$ by instantiating all the (free or bound) variables in
$\varphi^{(k)}(T)(a, \bar{y})$ by the $2k$-local names $Name_a^{2k}(a')$ of the nodes $a'$ in its $k$-neighborhood,
considering all the possible instantiations one by one.

Suppose $a$ instantiates $(x, \bar{y})$ by $(a, \bar{b})$ and also instantiates all the bound variables, so
that a variable-free formula $\psi$ is obtained. Since $\psi$ may contain atomic formulae $T(a', \bar{b}')$, $a$
sends the query $?T(a', \bar{b}')$ to $a'$; then $a'$ checks whether $T(a', \bar{b}')$ holds and sends the answer
back to $a$. This works because from $Name_a^{2k}(b'_i)$, the $2k$-local names of the $b'_i$ at $a$, $a'$ can
compute $Name_{a'}^{k}(b'_i)$, the $k$-local names of the $b'_i$ at $a'$.

During the above evaluation of $\varphi^{(k)}(T)(a, \bar{y})$, if a new tuple $(a, \bar{b})$ satisfying
$\varphi^{(k)}(T)(x, \bar{y})$ is obtained, $a$ stores it in a temporary buffer (the local table for $T$ is
updated later) using the $k$-local names of $a$ and $\bar{b}$ at $a$, and sends messages to inform the other
nodes in its $k$-neighborhood that new facts have been produced.

When the clock $\tau$ expires at a node $a$, $a$ resets $\tau$ to $3k$. If some new tuples have been
produced, $a$ updates the local table for $T$ and empties the temporary buffer. If some new tuples have
been produced or some informing messages have been received, $a$ evaluates $\varphi^{(k)}(T)(a, \bar{y})$ again. □
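The buffered, round-by-round evaluation described above follows the usual inflationary fixpoint scheme. The sketch below is a centralized simplification, not the distributed engine itself: the names `inflationary_fixpoint` and `phi`, and the example formula ("a node is in $T$ if it is the seed or has a neighbor in $T$"), are assumptions chosen for illustration.

```python
def inflationary_fixpoint(nodes, phi):
    """Iterate T := T ∪ phi(T) until no new tuple is produced.

    nodes : iterable of node identifiers
    phi   : hypothetical callback phi(a, T) returning the set of tuples
            that node a derives from the current content of T
    Returns the fixpoint T and the number of productive rounds.
    """
    T = set()
    rounds = 0
    while True:
        # Each node evaluates against the same snapshot of T; new tuples are
        # buffered and only merged into T at the end of the round.
        buffer = {t for a in nodes for t in phi(a, T)} - T
        if not buffer:
            return T, rounds
        T |= buffer
        rounds += 1

# Example (assumed): a path 0 - 1 - 2 - 3 with node 0 as the seed.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

def phi(a, T):
    # a derives the tuple (a,) if it is the seed or one of its neighbors is in T.
    if a == 0 or any((b,) in T for b in adj[a]):
        return {(a,)}
    return set()

T, rounds = inflationary_fixpoint(adj, phi)
```

Every productive round adds at least one tuple to `T`, so the number of rounds never exceeds the final size of `T`.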

Theorem 9. Let $\mathbf{G} = (V, G)$ be a network with $n$ nodes and diameter $\Delta$. $FP_{loc}$ formulae
$\mu(\varphi^{(k)}(T)(x, \bar{y}))$ can be evaluated on $\mathbf{G}$ with the following complexity upper bounds:

IN-TIME/ROUND    DIST-TIME    MSG-SIZE    #MSG/NODE
$O(1)$           $O(n)$       $O(1)$      $O(1)$

Proof. It is easy to see that the messages sent during the computation of $QE_{FP_{loc}}$ are of size $O(1)$.
Before the clock $\sigma$ expires, each node clearly sends only $O(1)$ messages of the format

$(\mu(\varphi^{(k)}(T)(x, \bar{y})), c)$.

Then each node sets the clock $\sigma'$ and collects the topology information of its $2k$-neighborhood.
Since the degree of nodes is bounded, the $2k$-neighborhood of a node $a$ contains only $O(1)$ nodes, so
each node sends only $O(1)$ messages during this phase as well.
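The $O(1)$ bound on the neighborhood size can be made explicit by a standard counting argument. The symbol $d$ for the constant bound on node degrees and the notation $N^{2k}(a)$ for the $2k$-neighborhood of $a$ are assumptions introduced here for illustration: there are at most $d(d-1)^{i}$ nodes at distance exactly $i+1$ from $a$, hence

```latex
|N^{2k}(a)| \;\le\; 1 + d \sum_{i=0}^{2k-1} (d-1)^{i} \;\le\; 1 + 2k\, d^{2k},
```

which depends only on $d$ and $k$ and not on the number $n$ of nodes.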

After the clock $\sigma'$ expires, each node $a$ sets the clock $\tau$ and starts evaluating
$\varphi^{(k)}(T)(a, \bar{y})$. During each period $3k$ of $\tau$, node $a$ considers all the possible
instantiations of the (free or bound) variables in $\varphi^{(k)}(T)(a, \bar{y})$ one by one and evaluates the
instantiated formula. During each such period, since the total number of different instantiations is
$O(1)$ and only $O(1)$ messages are sent during the evaluation of each instantiated formula
$\varphi^{(k)}(T)(a, \bar{b})$, the total number of messages sent by $a$ is $O(1)$.

Moreover, after the clock $\sigma'$ expires and before the distributed computation terminates, each
node $a$ sends only $O(1)$ messages in total. Indeed, $a$ can only receive informing messages from nodes
in its $k$-neighborhood, and the total number of tuples $(a, \bar{b})$ produced on nodes in the
$k$-neighborhood of $a$ is $O(1)$, so the total number of informing messages received by $a$ is $O(1)$.
Consequently $a$ evaluates $\varphi^{(k)}(T)(x, \bar{y})$ at most $O(1)$ times, and the total number of
messages sent by $a$ is $O(1)$.

After the clock $\sigma'$ expires, during each period $3k$ of the clock $\tau$, at least one informing
message must be sent by some node, which means that at least one new tuple of $T$ is produced. Since
there are at most $O(n)$ tuples in $T$, the total distributed time for the evaluation of
$\mu(\varphi^{(k)}(T)(x, \bar{y}))$ is $O(n)$. □

For asynchronous systems, a spanning tree rooted at the requesting node can be used to evaluate
$FP_{loc}$, and the complexity bounds DIST-TIME and #MSG/NODE become $O(n^2)$ and $O(n)$, respectively.

7.2 Networks with no global identifiers 

The query engines $QE_{FO_{loc}}$ and $QE_{FP_{loc}}$ evaluate $FO_{loc}$ and $FP_{loc}$ queries by using
only local names in the bounded neighborhoods of nodes, which suggests that unique global identifiers
for nodes are unnecessary for the evaluation of the local fragments of FO and FP. In this section, we
show that this is essentially the case, and consider their evaluation on networks with identifiers
which are only locally consistent and on anonymous networks with ports.

Definition 6. A network $\mathbf{G} = (V, G, L)$ with a labeling function $L : V \to \mathcal{L}$ assigning
identifiers to nodes is $k$-locally consistent if, for each node $a$ and any two distinct nodes
$b_1, b_2 \in N^k(a)$, $L(b_1) \neq L(b_2)$.
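As an illustration of this definition, the following sketch checks $k$-local consistency of a labeled graph by a breadth-first search of depth $k$ from every node. The function name `is_k_locally_consistent` and the example network are hypothetical, and the $k$-neighborhood is assumed here to contain the node itself together with all nodes at distance at most $k$.

```python
from collections import deque

def is_k_locally_consistent(adj, label, k):
    """Return True if labels are pairwise distinct inside every k-neighborhood.

    adj   : dict mapping each node to the list of its neighbors (assumed input)
    label : dict mapping each node to its identifier
    k     : neighborhood radius
    """
    for a in adj:
        # Breadth-first search from a, limited to depth k.
        dist = {a: 0}
        queue = deque([a])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        labels = [label[b] for b in dist]
        if len(labels) != len(set(labels)):
            return False
    return True

# Example (assumed): a path 0 - 1 - 2 - 3 where both end nodes carry the label "p".
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
label = {0: "p", 1: "q", 2: "r", 3: "p"}
consistent_1 = is_k_locally_consistent(path, label, 1)
consistent_2 = is_k_locally_consistent(path, label, 2)
```

Here the labeling is 1-locally consistent, but it is not 2-locally consistent, because the 2-neighborhood of node 1 contains both nodes labeled "p".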

Ports have been used to construct local names in the previous sub-section. They are not needed 
to evaluate FOioc and FPioc on locally-consistent networks since these networks have locally unique 
identifiers for nodes. 

Theorem 10. An $FO_{loc}$ formula $\varphi^{(k)}(x, \bar{y})$ can be evaluated on $k$-locally consistent
networks with the following complexity upper bounds:

IN-TIME/ROUND    DIST-TIME      MSG-SIZE    #MSG/NODE
$O(1)$           $O(\Delta)$    $O(1)$      $O(1)$

Theorem 11. An $FP_{loc}$ formula $\mu(\varphi^{(k)}(T)(x, \bar{y}))$ can be evaluated on $k$-locally consistent
networks with the following complexity upper bounds:

IN-TIME/ROUND    DIST-TIME    MSG-SIZE    #MSG/NODE
$O(1)$           $O(n)$       $O(1)$      $O(1)$

Local fragments of FO and FP can also be evaluated with the same complexity bounds on 
anonymous networks with ports since local names can be obtained by tracing the traversed ports 
of messages. 

Note that in general, FO and FP queries cannot be evaluated over locally consistent or
anonymous networks.

8 Conclusion 

Fixpoint logic expresses at a global level and in a declarative way the interesting functionalities
of distributed systems. We have proved that fixpoint formulae over graphs admit reasonable
distributed complexity upper bounds.

Moreover, we showed how global formulae can be translated into rule programs describing the
behavior of the nodes of the network and computing the same result. The examples given in
the paper have been implemented on the Netquest system, which supports the Netlog language.
Finally, we showed the potential of fragments of fixpoint logic restricted to local neighborhoods,
which remain very expressive but admit much tighter distributed complexity upper bounds, with a
bounded number of messages of bounded size, independent of the size of the network.

These results show how classical logical formalisms can help in designing high-level programming
abstractions for distributed systems that allow the desired global result to be stated without
specifying its computation mode. We plan to pursue this investigation in the following directions:
(i) investigate the distributed complexity of other logical formalisms such as monadic second-order
logic, which is very expressive on graphs; (ii) study the optimization of the translation from
fixpoint logic to Netlog, to obtain efficient programs; (iii) extend these results to other
distributed computing models.